As you build applications for the cloud, there are fewer "hard downs" caused by conditions such as "filesystem full", "CPU high", or "process down". However, systems may still not behave as expected: slow is the new down.
SREs need to pivot to new metrics that reflect how a service or API is consumed and that let us move from reactive to predictive (or even preventive) operations: Latency, Request Rate, Error Rate, Saturation, and Utilization. These are what we call the Golden Signals.
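
To make the five signals concrete, here is a minimal Python sketch that derives them from one observation window of request records. Everything in it is an assumption made for illustration and not part of any particular monitoring product: the `Request` record, the `golden_signals` function and its parameters, the nearest-rank percentile, and the choice to model saturation as queue fill level and utilization as the busy fraction of the window.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    """Hypothetical record of one served request."""
    latency_ms: float  # time taken to serve the request
    ok: bool           # False if the request failed


def percentile(values: List[float], pct: float) -> Optional[float]:
    """Nearest-rank percentile; returns None for an empty list."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def golden_signals(
    requests: List[Request],
    window_s: float,      # length of the observation window in seconds
    queue_depth: int,     # requests currently waiting (assumed saturation input)
    queue_capacity: int,  # maximum queue size (assumed saturation input)
    busy_s: float,        # seconds the resource was busy during the window
) -> Dict[str, Optional[float]]:
    """Summarize one window of traffic as the five signals."""
    total = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    latencies = [r.latency_ms for r in requests]
    return {
        "latency_p50_ms": percentile(latencies, 50),
        "latency_p95_ms": percentile(latencies, 95),
        "request_rate_per_s": total / window_s,
        "error_rate": failed / total if total else 0.0,
        # Assumption: saturation is how full the waiting queue is.
        "saturation": queue_depth / queue_capacity,
        # Assumption: utilization is the busy fraction of the window.
        "utilization": busy_s / window_s,
    }


if __name__ == "__main__":
    sample = [
        Request(100.0, True),
        Request(200.0, True),
        Request(300.0, False),
        Request(400.0, True),
    ]
    print(golden_signals(sample, window_s=2.0, queue_depth=3,
                         queue_capacity=10, busy_s=1.5))
```

Reporting latency as percentiles instead of an average is a deliberate choice in this sketch: a slow tail stays visible even when most requests are fast, which is the "slow is the new down" case.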
------------------------------
Ingo Averdunk
Distinguished Engineer
IBM
------------------------------