Practice observability vocabulary used during chaos experiments: experiment dashboards, blast radius metrics, safety thresholds, and chaos scoring.
1 / 5
An SRE says 'the experiment dashboard shows the steady state is holding'. What does this mean?
The experiment dashboard tracks the defined steady-state metrics in real time. 'Steady state is holding' means the injected failures are not pushing those metrics beyond their defined thresholds, so the system is tolerating the fault under test.
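A minimal sketch of the check behind 'steady state is holding', assuming each steady-state metric has a lower and an upper bound. The metric names, bounds, and the `SteadyStateBound` type are hypothetical and not taken from any particular chaos tool.

```python
from dataclasses import dataclass


@dataclass
class SteadyStateBound:
    """Acceptable range for one steady-state metric (hypothetical type)."""
    metric: str
    lower: float
    upper: float


def steady_state_holding(observed: dict[str, float],
                         bounds: list[SteadyStateBound]) -> bool:
    """Return True only if every tracked metric is inside its bounds."""
    return all(b.lower <= observed[b.metric] <= b.upper for b in bounds)


# Illustrative bounds; real values come from the experiment's hypothesis.
bounds = [
    SteadyStateBound("success_rate", 0.99, 1.0),
    SteadyStateBound("p99_latency_ms", 0.0, 300.0),
]

holding = steady_state_holding(
    {"success_rate": 0.995, "p99_latency_ms": 240.0}, bounds)
degraded = steady_state_holding(
    {"success_rate": 0.97, "p99_latency_ms": 240.0}, bounds)
print(holding, degraded)
```

The first sample stays inside both bounds, while the second drops below the success-rate floor, so the dashboard would no longer report the steady state as holding.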
2 / 5
What is a 'blast radius metric' in a chaos experiment?
Blast radius metrics quantify the scope of impact: percentage of affected requests, number of impacted users, or downstream services affected. Monitoring blast radius helps ensure the experiment stays within safe bounds.
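The three measures named above can be computed from raw experiment data roughly as follows. This is a sketch under assumed inputs: the function name `blast_radius`, its parameters, and the sample values are all hypothetical.

```python
def blast_radius(total_requests: int,
                 affected_requests: int,
                 affected_user_ids: list[str],
                 affected_services: list[str]) -> dict:
    """Summarize the scope of impact for one observation window."""
    pct = 100.0 * affected_requests / total_requests if total_requests else 0.0
    return {
        "affected_request_pct": pct,
        # Count each user once, even if several of their requests failed.
        "impacted_users": len(set(affected_user_ids)),
        "downstream_services": sorted(set(affected_services)),
    }


summary = blast_radius(
    total_requests=2000,
    affected_requests=50,
    affected_user_ids=["u1", "u2", "u1"],
    affected_services=["cart", "payments", "cart"],
)
print(summary)
```

Deduplicating users and services keeps the numbers about scope rather than volume: one user with many failed requests still counts as one impacted user.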
3 / 5
What happens when a message says 'experiment paused due to safety threshold'?
Safety thresholds are pre-defined abort conditions. If a metric such as error rate or latency exceeds its threshold, the chaos orchestrator automatically pauses the experiment to protect users. This is a critical safety mechanism.
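The abort behaviour can be sketched as a loop that checks every metric sample against its threshold and stops at the first breach. The threshold values, metric names, and the `run_experiment` function are assumptions for illustration, not the API of a real orchestrator.

```python
from typing import Optional

# Hypothetical abort conditions: metric name -> maximum allowed value.
SAFETY_THRESHOLDS = {"error_rate": 0.05, "p99_latency_ms": 800.0}


def breached_threshold(sample: dict[str, float],
                       thresholds: dict[str, float]) -> Optional[str]:
    """Return the first metric that exceeds its threshold, or None."""
    for metric, limit in thresholds.items():
        if sample.get(metric, 0.0) > limit:
            return metric
    return None


def run_experiment(samples: list[dict[str, float]],
                   thresholds: dict[str, float]) -> str:
    """Walk through metric samples and pause on the first breach."""
    for step, sample in enumerate(samples):
        metric = breached_threshold(sample, thresholds)
        if metric is not None:
            return f"paused at step {step}: {metric} exceeded safety threshold"
    return "completed"


samples = [
    {"error_rate": 0.01, "p99_latency_ms": 300.0},
    {"error_rate": 0.08, "p99_latency_ms": 350.0},
]
status = run_experiment(samples, SAFETY_THRESHOLDS)
print(status)
```

Checking before each further step means the experiment never continues past a breach, which is the property that makes the threshold a safety mechanism rather than a report.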
4 / 5
What is a 'chaos experiment trace' in the context of distributed tracing?
Chaos experiment traces are distributed traces collected while failures are injected. They reveal which services degrade under failure, helping teams understand the failure propagation path and validate that circuit breakers and fallbacks work correctly.
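One way to read a failure propagation path out of a trace is to start at the span where the fault was injected and follow parent links for as long as the parent also recorded an error. The sketch below assumes a simplified span shape (`span_id`, `parent_id`, `service`, `error`) rather than the data model of any real tracing system.

```python
def propagation_path(spans: list[dict], fault_span_id: str) -> list[str]:
    """List services that errored, from the injected fault upward."""
    by_id = {s["span_id"]: s for s in spans}
    path = []
    current = by_id.get(fault_span_id)
    # Stop at the first ancestor that did not error: the failure was
    # contained there (for example by a fallback or circuit breaker).
    while current is not None and current["error"]:
        path.append(current["service"])
        current = by_id.get(current["parent_id"])
    return path


# Hypothetical trace: gateway -> checkout -> payments, fault in payments.
spans = [
    {"span_id": "a", "parent_id": None, "service": "gateway", "error": False},
    {"span_id": "b", "parent_id": "a", "service": "checkout", "error": True},
    {"span_id": "c", "parent_id": "b", "service": "payments", "error": True},
]
path = propagation_path(spans, "c")
print(path)
```

In this example the error reaches `checkout` but not `gateway`, which is the kind of evidence a team would look for when validating that a fallback contained the failure.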
5 / 5
What is 'chaos scoring' used for?
Chaos scoring summarizes resilience as a single number tracked over time. A system scores higher as it passes more experiments without exceeding safety thresholds, so the trend shows whether reliability is improving.
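There is no single standard formula for a chaos score. The sketch below assumes the simplest one, the percentage of experiments in a period that finished without breaching a safety threshold. The period labels and results are made-up illustration data.

```python
def chaos_score(results: list[bool]) -> float:
    """Percentage of experiments passed; True means no threshold breach."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


# Hypothetical pass/fail history, grouped by period.
history = {
    "sprint-1": [True, False, False, True],
    "sprint-2": [True, True, False, True],
}
trend = {period: chaos_score(results) for period, results in history.items()}
print(trend)
```

Comparing the score across periods gives the trend view: here the pass rate rises from one period to the next. Real scoring schemes may also weight experiments by severity or coverage, which this sketch does not attempt.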