Reliability Reporting Vocabulary
5 exercises — Practice the language of reliability reports: availability statements, contributing factors, action items, trends, and translating technical metrics for business stakeholders.
1 / 5
An SRE writes in the weekly reliability report: "We achieved 99.95% availability this week." A colleague asks what "availability" means precisely in this context. Which explanation is most accurate?
In SRE reporting, "availability" usually refers to a request-based SLI: the percentage of valid user requests that were served successfully. It is not a simple uptime check.
The distinction matters when communicating to stakeholders:
• Uptime-based: "The server was up 99.95% of the week" — says nothing about whether requests succeeded
• Request-based: "99.95% of requests were served successfully" — directly reflects user experience
When writing a reliability report, always clarify which definition you are using. The standard in modern SRE practice is the request-based definition, because it corresponds to the SLI that drives the error budget.
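The gap between the two definitions can be shown with a small calculation. The following is a minimal sketch, not a standard implementation: the `WeeklyStats` class, its field names, and the sample counts are hypothetical and chosen only so that the request-based figure matches the 99.95% in the example report.

```python
from dataclasses import dataclass


@dataclass
class WeeklyStats:
    """Hypothetical weekly counters; all field names are illustrative assumptions."""
    total_seconds: int        # length of the reporting window
    up_seconds: int           # seconds the health check reported the service as up
    valid_requests: int       # requests that count toward the SLI
    successful_requests: int  # valid requests that were served successfully


def uptime_pct(stats: WeeklyStats) -> float:
    """Uptime-based availability: share of the window the service was running."""
    return 100.0 * stats.up_seconds / stats.total_seconds


def request_availability_pct(stats: WeeklyStats) -> float:
    """Request-based availability: successful requests over valid requests."""
    if stats.valid_requests == 0:
        raise ValueError("no valid requests in the reporting window")
    return 100.0 * stats.successful_requests / stats.valid_requests


# Made-up numbers: the server never went down, yet 1,000 requests failed.
week = WeeklyStats(
    total_seconds=7 * 24 * 3600,
    up_seconds=7 * 24 * 3600,
    valid_requests=2_000_000,
    successful_requests=1_999_000,
)

print(f"Uptime-based:  {uptime_pct(week):.2f}%")
print(f"Request-based: {request_availability_pct(week):.2f}%")
```

With these sample counts the uptime figure is 100.00% while the request-based figure is 99.95%, so a report that says only "availability" would leave the reader unsure which of the two numbers is meant.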
Key vocabulary:
• availability — proportion of successful requests to total valid requests; expressed as a percentage
• request-based availability — measures individual requests, not server uptime
• uptime — the fraction of time a server or service was running; less precise than request-based availability
• SLI-based reporting — reliability reporting grounded in the same metrics used for the SLO