3 exercises — read and discuss p50/p95/p99 latency, throughput, error rates, and error budgets. Vocabulary every backend developer and SRE needs.
Performance metrics vocabulary
p50 / median — "Half of requests complete faster than this."
p95 / p99 — "95% / 99% of requests complete within this time." The rest are tail latency.
throughput — requests per second (req/s) or transactions per second (TPS)
error rate — percentage of requests that fail
error budget — how much failure your SLO allows in a given period
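The vocabulary above can be made concrete with a small sketch. This is a minimal illustration, not production monitoring code: the latency samples are invented, and the nearest-rank percentile method is one of several common definitions.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value such that
    at least p% of observations fall at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical latency samples in milliseconds (invented for illustration)
latencies_ms = [38, 40, 35, 85, 90, 320, 37, 42, 39, 1240]

p50 = percentile(latencies_ms, 50)  # median: half of requests are faster
p95 = percentile(latencies_ms, 95)  # all but the slowest 5% finish within this
p99 = percentile(latencies_ms, 99)  # the slowest 1% sit beyond this: tail latency

# Throughput and error rate over a hypothetical 1-second window
total_requests = len(latencies_ms)
failed_requests = 1                             # invented for illustration
throughput_rps = total_requests / 1.0           # requests per second
error_rate = failed_requests / total_requests   # fraction of requests that fail
```

Note that with nearest-rank, p95 on a small sample like this lands on the single 1,240ms outlier, which is exactly why tail percentiles matter more than the median for spotting slow requests.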
1 / 3
An SRE shares this data in a post-mortem. Which sentence most accurately describes the performance profile?
API endpoint: GET /api/search
Requests last 1h: 48,230
Latency percentiles:
p50: 38ms
p75: 85ms
p95: 320ms
p99: 1,240ms
Error rate: 0.12%
Option B is the most professional analysis. It: (1) correctly interprets p50 as the median (not the average), (2) translates p95 and p99 into actionable business language ("slowest 5% of requests"), (3) flags the real issue — the large gap between p50 (38ms) and p99 (1,240ms) indicates a tail latency problem, (4) contextualises the impact ("noticeable to a small percentage of users"). Option A incorrectly says "average" — p50 is the median. In performance discussions, never conflate the median with the average: a handful of very slow requests can inflate the average while leaving the median unchanged. Option C just reads the numbers back without analysis. Option D is too vague for incident discussion. Key vocab: p50 = "at the 50th percentile" = "half of requests are faster than this", p99 = "99% of requests complete in this time or less".
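The tail latency check described above can be sketched as a simple ratio of p99 to p50. The 10× threshold here is an illustrative rule of thumb, not an industry standard; teams pick thresholds based on their own SLOs.

```python
def tail_latency_ratio(p50_ms, p99_ms):
    """How many times slower the worst 1% of requests is versus the median."""
    return p99_ms / p50_ms

# Values taken from the exercise data: p50 = 38ms, p99 = 1,240ms
ratio = tail_latency_ratio(38, 1240)
has_tail_problem = ratio > 10  # illustrative threshold, not a standard
```

Here the worst 1% of requests are roughly 33× slower than the median, which is the gap Option B correctly flags.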
2 / 3
In a performance review meeting, your team lead says: "Our error rate SLO is 0.5%. We've burned 60% of our error budget in the first week." Which interpretation is correct?
Option B correctly interprets the error budget concept. The SLO (Service Level Objective) of 0.5% defines a budget — the total amount of failure allowed in a given time window (usually a month or quarter). Having burned 60% of that budget in the first week leaves only ~40% for the remaining three weeks; at the current burn rate the budget would be exhausted early in week two, well before the window ends. This signals the team needs to act. Option A confuses "error budget" with actual error count. Options C and D misread what the error budget is. Vocab: error budget = how much downtime or how many errors your SLO permits; burn rate = how fast you're consuming the budget; "burning the budget" = consuming allowed errors faster than planned.
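The burn-rate arithmetic behind this scenario fits in a few lines. This is a sketch assuming a 4-week budget window: a burn rate of 1.0 means the budget lasts exactly the whole window; above 1.0 it runs out early.

```python
def burn_rate(budget_fraction_used, window_fraction_elapsed):
    """Ratio of budget consumed to time elapsed; 1.0 = on pace."""
    return budget_fraction_used / window_fraction_elapsed

def weeks_until_exhausted(budget_fraction_used, weeks_elapsed):
    """Projected total weeks before the budget hits 100% at the current pace."""
    return weeks_elapsed / budget_fraction_used

# Scenario from the exercise: 60% burned after 1 week of a 4-week window
rate = burn_rate(0.60, 1 / 4)
projected_weeks = weeks_until_exhausted(0.60, 1)
```

A burn rate of 2.4 and a projected exhaustion point of roughly 1.7 weeks both say the same thing: without intervention, the budget is gone long before the window closes.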
3 / 3
A developer describes a recent optimisation result. Which description is most technically precise?
Before: avg response time 850ms, throughput 120 req/s
After: avg response time 210ms, throughput 480 req/s
Option B is the most precise because it: (1) gives the latency improvement as a percentage (a ~75% reduction), (2) uses the precise term "quadrupled" for a 4× improvement, (3) correctly calls the requests-per-second metric "throughput", (4) provides both before and after values for context. Option A is too vague for a technical audience: "significantly" without numbers has no meaning in engineering. Option C says "4 times faster", which is ambiguous — it could mean the new latency is a quarter of the original (4× the speed), or, read literally, that speed increased by four times on top of the original (5× the speed). Prefer unambiguous phrasing such as "quadrupled", "throughput increased by 4×", or "latency reduced by 75%". Option D says "+360 requests per second", which is correct arithmetic (480 − 120 = 360) but less informative than saying throughput quadrupled.
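The before/after arithmetic in this explanation can be checked directly. A minimal sketch using the exercise's figures:

```python
# Figures from the exercise
before_ms, after_ms = 850, 210     # average response time
before_rps, after_rps = 120, 480   # throughput, requests per second

# Latency: express the change as a percentage reduction (unambiguous)
latency_reduction_pct = (before_ms - after_ms) / before_ms * 100

# Throughput: express the change as a factor ("quadrupled") and a delta
throughput_factor = after_rps / before_rps   # 4.0 = quadrupled
throughput_delta = after_rps - before_rps    # the "+360 req/s" from Option D
```

Stating the factor (4×) and the percentage reduction (~75%) together, as Option B does, leaves no room for the "4 times faster" ambiguity.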