Intermediate Numbers & Data #slo #sla #error-budget #sre #devops

SLO, SLA & Error Budget Language

5 exercises on describing reliability targets, error budgets, and burn rates in professional English.

The reliability vocabulary trio

SLI: the measurement — e.g. request success rate
SLO: the internal target — e.g. 99.9% success rate over 28 days
SLA: the external contract — with financial consequences for breach
Error budget: 100% minus the SLO target, i.e. the amount of failure allowed over the SLO window (0.1% for a 99.9% SLO)
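
The error-budget arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a standard API: the function names are hypothetical, and it assumes a time-based availability SLI (for a request-based SLI such as success rate, the budget would be counted in failed requests rather than minutes). The burn-rate helper assumes the common convention that a burn rate of 1x spends exactly the whole budget over the SLO window.

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Allowed downtime in minutes for a time-based SLO (assumed SLI type)."""
    window_minutes = window_days * 24 * 60
    budget_fraction = (100.0 - slo_percent) / 100.0  # 100% minus the SLO target
    return window_minutes * budget_fraction


def days_until_exhausted(window_days: int, burn_rate: float) -> float:
    """Days until the budget is gone at a constant burn rate.

    Assumes a burn rate of 1.0 uses up the budget exactly at the end
    of the SLO window, so a rate of 3.0 uses it up three times faster.
    """
    if burn_rate <= 0:
        raise ValueError("burn_rate must be positive")
    return window_days / burn_rate


budget = error_budget_minutes(99.9, 28)
print(f"Error budget: {budget:.2f} minutes over 28 days")
print(f"At 3x burn rate: exhausted in {days_until_exhausted(28, 3.0):.1f} days")
```

With a 99.9% SLO over 28 days, this gives a budget of roughly 40 minutes; at a sustained 3x burn rate, that budget would last a little over 9 days.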


1 / 5

An SRE says: "We're burning through our error budget at 3x the normal rate." What does this mean?

2 / 5

Which sentence correctly uses SLO vocabulary in a professional context?

3 / 5

A post-mortem reads: "The incident caused 47 minutes of degraded availability, consuming 107% of our monthly error budget." What does this mean?

4 / 5

Your service has exceeded its SLO for 3 consecutive months. How would you describe this situation professionally?

5 / 5

A colleague asks: "What's our MTTR this month?" You know the team had 3 incidents lasting 12, 8, and 4 minutes. What is the correct answer and how do you phrase it?

MTTR — Mean Time to Recovery (or Resolution)

MTTR = the average time to recover from an incident, calculated over a set of incidents.

Calculation:
(12 + 8 + 4) / 3 = 24 / 3 = 8 minutes

MTTR is an average, not a sum or a maximum. Reporting the total (24 minutes) or the longest incident (12 minutes) instead is a common mistake.
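
The same calculation as a minimal Python sketch. The function name is a hypothetical choice for illustration; the durations are the three incidents from the exercise.

```python
def mean_time_to_recovery(durations_minutes: list[float]) -> float:
    """Average recovery time over a set of incidents."""
    if not durations_minutes:
        raise ValueError("need at least one incident to compute MTTR")
    return sum(durations_minutes) / len(durations_minutes)


incidents = [12, 8, 4]  # recovery time of each incident, in minutes

mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr:.0f} minutes")               # the average
print(f"Total downtime: {sum(incidents)} minutes")  # the sum, not MTTR
print(f"Longest incident: {max(incidents)} minutes")  # the maximum, not MTTR
```

Printing the sum and the maximum next to the average makes the difference explicit: only the first figure answers "What's our MTTR?".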

Related metrics vocabulary:

Metric	Meaning
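MTTD	Mean Time to Detect: how long it takes to detect an incident, i.e. until the alert fires
MTTR	Mean Time to Recovery: how long until service is restored
MTTF	Mean Time to Failure: average time a system runs before it fails, typically used for components that are replaced rather than repaired
MTBF	Mean Time Between Failures: average time between one failure and the next in a system that is repaired and returned to service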

Professional phrasing:

"Our MTTR this month was 8 minutes — down from 14 minutes last month."
"We're targeting sub-10-minute MTTR for P1 incidents."