Learn the vocabulary of monitoring latency, traffic, errors, and saturation to gauge service health.
1 / 5
At standup, a dev mentions monitoring a service's latency, traffic, errors, and saturation as the small set of core signals that together capture whether that service is actually healthy. What are these four signals called?
The four golden signals (latency, traffic, errors, and saturation) are a small, well-established set of core signals that together capture whether a service is healthy, without requiring you to monitor every metric the service could expose. Treating an exhaustive list of metrics with equal priority makes it hard to tell which one matters most during an active incident. Focusing on these four gives an on-call engineer a fast, reliable read on a service's health.
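To make the four signals concrete, here is a minimal sketch of how they might be summarized from one observation window. The `Request` record, the `golden_signals` helper, the choice of the 95th percentile for latency, and the use of a resource's in-use/capacity ratio for saturation are all illustrative assumptions, not part of any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float  # how long the request took
    ok: bool            # False if the request failed


def percentile(values, pct):
    """Nearest-rank percentile; returns 0.0 for an empty list."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def golden_signals(requests, window_seconds, in_use, capacity):
    """Summarize one observation window as the four golden signals."""
    total = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": percentile([r.duration_ms for r in requests], 95),
        "traffic_rps": total / window_seconds,           # requests per second
        "error_rate": failed / total if total else 0.0,  # fraction that failed
        "saturation": in_use / capacity,                 # e.g. connections in use
    }


# Ten requests over five seconds, one failure, 8 of 10 connections in use.
window = [Request(duration_ms=10.0 * i, ok=(i != 10)) for i in range(1, 11)]
signals = golden_signals(window, window_seconds=5, in_use=8, capacity=10)
print(signals)
```

In practice these values usually come from a metrics system rather than raw request lists; the sketch only shows what each signal measures.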
2 / 5
During a design review, the team wants a dashboard to prominently surface these four core signals for every service, rather than burying them among dozens of less critical metrics. Which capability supports this?
A dedicated golden-signals dashboard prominently surfaces latency, traffic, errors, and saturation for every service, rather than burying these critical signals among dozens of less important metrics on a generic dashboard. Displaying every metric with equal visual prominence forces an on-call engineer to hunt for the signal that matters during a stressful incident. Putting the four signals first lets that engineer assess a service's health at a glance.
3 / 5
In a code review, a dev notices an alert is configured specifically on the saturation signal, catching a resource nearing its capacity limit before it actually causes a visible error. What does this represent?
Proactive alerting on the saturation signal catches a resource nearing its capacity limit, such as memory or connection pool usage, before it causes an error a user would notice. Waiting for a visible error before investigating saturation means the team reacts only after the problem is already affecting real users. Alerting on saturation directly turns it into an early warning instead of something discovered after the fact.
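A saturation alert of this kind can be sketched as a simple threshold check. The function name `saturation_level` and the 80% and 95% thresholds are assumptions chosen for illustration; real thresholds depend on the resource and how quickly it fills.

```python
def saturation_level(in_use, capacity, warn_at=0.80, critical_at=0.95):
    """Classify how close a resource is to its capacity limit.

    The thresholds are hypothetical defaults, not recommended values.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    ratio = in_use / capacity
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warn"
    return "ok"


# A connection pool with 100 slots: the warning fires before the pool is full.
for used in (70, 85, 98):
    print(used, saturation_level(used, 100))
```

The point of the `warn` level is that it fires while requests are still succeeding, which leaves time to act before users see errors.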
4 / 5
An incident report shows a service ran out of database connections and started failing, but no one had been monitoring saturation, only latency and errors, so the warning signs went unnoticed until the outage was already underway. What practice would prevent this?
Monitoring and alerting on all four golden signals, including saturation, catches a resource nearing its limit before it actually triggers a full outage. Monitoring only a subset, like latency and errors, misses exactly the early warning saturation would have provided in this incident. This complete, four-signal coverage is the whole point of the golden-signals framework, since each signal captures a different, complementary aspect of a service's health.
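One way to guard against this kind of gap is to check a service's alert rules for coverage of all four signals. The sketch below assumes a hypothetical rule format in which each rule is a dictionary with a `signal` field; the rule names are invented for illustration.

```python
GOLDEN_SIGNALS = ("latency", "traffic", "errors", "saturation")


def missing_signals(alert_rules):
    """Return the golden signals that no alert rule covers."""
    covered = {rule["signal"] for rule in alert_rules}
    return [signal for signal in GOLDEN_SIGNALS if signal not in covered]


# Hypothetical rules resembling the incident: only latency and errors are covered.
rules = [
    {"name": "slow-requests", "signal": "latency"},
    {"name": "high-failure-rate", "signal": "errors"},
]
gaps = missing_signals(rules)
print(gaps)
```

A check like this could run in review or CI so that a missing saturation alert is caught before an incident exposes it.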
5 / 5
During a PR review, a teammate asks why the team standardizes on these four particular golden signals instead of just monitoring whichever metrics each individual service team happens to find interesting. What is the reasoning?
A standardized, small set of core signals gives every on-call engineer a fast, consistent way to check any service's health, even one they've never worked on directly. An ad hoc metric set chosen independently by each team varies unpredictably, forcing an engineer to relearn what matters for every different service during an incident. The tradeoff is that a fixed set of four signals may not capture every single nuance specific to an unusual service, so teams often supplement them with a few service-specific metrics.