5 exercises — practice interpreting ops/sec, throughput numbers, and percentile latency data. Learn to distinguish what benchmark numbers actually measure and how to communicate them honestly.
Key vocabulary for reading benchmark results
Throughput (ops/sec, req/s, MB/s): how many operations complete per unit time; a measure of capacity.
p50/p95/p99: percentile latency. p99 is the latency that 99% of requests stay at or under; the slowest 1% (the tail) take longer.
Mean vs. median: when mean >> median, slow outliers are skewing the average; prefer percentiles.
Benchmark conditions matter: single-node, warm cache, no auth. State these when reporting.
Throughput vs. latency: throughput = how much work gets done per unit time; latency = how long one unit of work takes to complete.
1 / 5
A benchmark report shows: "Throughput: 48,320 ops/sec (p50: 2.1ms, p95: 8.4ms, p99: 31.2ms)." Which interpretation is correct?
Option B is the complete and accurate benchmark reading. Key concepts: (1) ops/sec (operations per second) is throughput: how much work the system does per second, (2) p50 is the 50th percentile, the median: half of requests are faster, half slower, (3) p95 is the 95th percentile: 95% of requests complete within this time, (4) p99 is the 99th percentile, commonly called "tail latency": 99% of requests complete within this time, and the slowest 1% take longer. The critical insight Option B captures: p99 (31.2ms) is roughly 15x p50 (2.1ms), which points to a significant tail latency problem even though the median looks fast. Option A reads the p50 as if it applied to all requests, which is fundamentally wrong. Option C calculates an average of percentiles, which is mathematically incorrect. Option D draws an incomplete conclusion from only the p50.
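To make the percentile definitions concrete, here is a minimal Python sketch. It assumes the nearest-rank convention for percentiles (benchmark tools differ on this, and many interpolate instead), and the sample latencies are invented for illustration. Only the 2.1ms and 31.2ms figures come from the question.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds, invented for illustration.
latencies_ms = [2, 2, 2, 3, 3, 3, 4, 5, 9, 40]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

# Tail ratio for the figures quoted in the question.
reported_p50_ms, reported_p99_ms = 2.1, 31.2
tail_ratio = reported_p99_ms / reported_p50_ms

print(f"p50={p50}ms p95={p95}ms p99={p99}ms")
print(f"reported p99/p50 = {tail_ratio:.1f}x")
```

With only ten samples, p95 and p99 both land on the single slowest request, which is a reminder that tail percentiles need a large sample count to be meaningful.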
2 / 5
A benchmark shows your database query takes 450ms at p99. Your SLA requires responses under 500ms. A colleague says "We're fine — we're under the SLA." What's the most accurate assessment?
Option B demonstrates proper benchmark literacy. It recognises that: (1) benchmark results are not the same as production performance: application overhead, network latency, connection-pool waits, and higher load all add time on top of the measured query, (2) 50ms of headroom at p99 (10% of the SLA) is not comfortable, because p99 typically degrades under production load, (3) a benchmark latency is a best case: a floor that production numbers will sit above, not a ceiling they will stay under. Option A reads the number correctly but ignores engineering context. Option C dismisses the risk entirely. Option D misunderstands performance goals: you want p99 to be as low as possible, not to "maximise" it by sitting close to the limit. When reading benchmark results, always ask: "What's not included in this number, and what happens when load increases?"
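The headroom argument can be checked with simple arithmetic. The sketch below uses the two figures from the question. The variable names are illustrative, and the 15% production degradation is a hypothetical assumption, not a measurement.

```python
sla_ms = 500.0
benchmark_p99_ms = 450.0

headroom_ms = sla_ms - benchmark_p99_ms
# Headroom as a share of the SLA budget.
headroom_pct_of_sla = headroom_ms / sla_ms * 100
# How much p99 can grow, relative to the benchmark, before the SLA is breached.
max_growth_pct = headroom_ms / benchmark_p99_ms * 100

# Hypothetical assumption: production conditions add 15% to p99.
assumed_degradation_pct = 15.0
projected_p99_ms = benchmark_p99_ms * (1 + assumed_degradation_pct / 100)
breaches_sla = projected_p99_ms > sla_ms

print(f"headroom: {headroom_ms:.0f}ms ({headroom_pct_of_sla:.0f}% of SLA)")
print(f"p99 can grow about {max_growth_pct:.1f}% before breaching the SLA")
print(f"projected p99 at +{assumed_degradation_pct:.0f}%: {projected_p99_ms:.1f}ms, breach={breaches_sla}")
```

Under that assumed degradation the projected p99 already exceeds the limit, which is why a 50ms margin measured in a benchmark is not the same as being safely inside the SLA.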
3 / 5
A load test report shows: "Mean: 45ms, Median: 12ms." What does this distribution tell you?
Option B correctly interprets a skewed latency distribution. This is one of the most important benchmarking concepts: mean vs. median in latency data. Latency distributions in real systems are almost always right-skewed: a few very slow requests (GC pauses, cold caches, lock contention) pull the mean up dramatically. The median (p50) is more representative of what most users experience. When mean >> median, you have outliers — and those outliers are real user experiences. Option A treats the mean as the typical case — incorrect for skewed data. Option C ("mean is more reliable") is the opposite of best practice for latency data. Option D treats a statistically common pattern as an error. Rule: for latency analysis, always use percentiles (p50, p95, p99), not means.
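A small example shows how one slow request can produce this gap. The sketch below uses Python's standard `statistics` module. The ten latency values are hypothetical, chosen so that the mean and median match the 45ms and 12ms in the question.

```python
import statistics

# Hypothetical latencies in milliseconds: nine typical requests and one slow outlier.
latencies_ms = [10, 11, 11, 12, 12, 12, 13, 13, 14, 342]

mean_ms = statistics.mean(latencies_ms)
median_ms = statistics.median(latencies_ms)

# Removing the single outlier shows how much it alone shifts the mean.
mean_without_outlier_ms = statistics.mean(latencies_ms[:-1])

print(f"mean={mean_ms}ms median={median_ms}ms")
print(f"mean without the slowest request={mean_without_outlier_ms}ms")
```

Nine of the ten requests finish in 14ms or less, yet the mean reports 45ms. The median describes the typical request, and the outlier is the case that the p99 figure is meant to expose.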
4 / 5
A benchmark report says "Throughput: 12,000 req/s (single-node, warm cache, no authentication)." How should you present this result to a product manager?
Option B is the honest, professional way to communicate benchmark results to non-technical stakeholders. The key elements: (1) clearly names the benchmark conditions (single-node, warm cache, no auth) — these are optimistic conditions that don't reflect production, (2) explicitly says "real-world performance will be lower," (3) commits to a more realistic test. Option A strips the conditions and presents the number as a fact — this is how benchmark results get misunderstood and poor architecture decisions get made. Option C compares to "industry standards" without knowing the full context. Option D applies an arbitrary safety factor rather than actually testing under production conditions. Always attach benchmark conditions to benchmark numbers — without them, the number is meaningless.
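One way to keep conditions attached to a number is to store them together and refuse to print one without the other. This is a hypothetical sketch only: the `BenchmarkResult` class and its fields are invented for illustration and do not come from any particular benchmarking tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchmarkResult:
    """Hypothetical record that keeps a benchmark number and its conditions together."""
    metric: str
    value: float
    unit: str
    conditions: tuple

    def summary(self) -> str:
        # Refuse to produce a report line if the conditions are missing.
        if not self.conditions:
            raise ValueError("refusing to report a benchmark number without its conditions")
        return f"{self.metric}: {self.value:,.0f} {self.unit} ({', '.join(self.conditions)})"

result = BenchmarkResult(
    metric="Throughput",
    value=12000,
    unit="req/s",
    conditions=("single-node", "warm cache", "no authentication"),
)
line = result.summary()
print(line)
```

The design choice here is that the summary is the only way to get a printable line, so the conditions cannot be dropped by accident when the number is copied into a slide or a status update.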
5 / 5
Which sentence correctly describes what "throughput" measures in a benchmark?
Option B is the complete, accurate definition of throughput. The essential distinction: (1) throughput = rate of work = how many operations per second/minute/hour, (2) latency = time per operation = how long one operation takes. These are related but distinct: a system can have high throughput with acceptable latency, low latency with low throughput, or, as with batch processing, high throughput while each individual item waits a long time. Examples help: "5,000 transactions per second" (payment processing throughput) or "200 MB/s" (storage throughput). Option A confuses throughput with latency. Option C conflates the two metrics, a common misunderstanding. Option D describes total count over a run, not a rate: throughput is always a rate (per unit time), not a total. Tip: when reading benchmarks, always note the unit: "ops/sec", "req/s", "MB/s", "QPS" (queries per second) are all throughput measures.
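The difference between a total and a rate is easy to see in code. In this minimal sketch the `throughput` helper and both runs are hypothetical. The runs complete the same number of operations but over different durations.

```python
def throughput(completed_ops, duration_s):
    """Rate of work: completed operations divided by elapsed seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return completed_ops / duration_s

# Hypothetical runs: identical totals, different durations.
run_a_ops_per_sec = throughput(150_000, 30.0)
run_b_ops_per_sec = throughput(150_000, 60.0)

print(f"run A: {run_a_ops_per_sec:,.0f} ops/sec")
print(f"run B: {run_b_ops_per_sec:,.0f} ops/sec")
```

Both runs report 150,000 operations in total, but run A has twice the throughput of run B. A count without a time unit says nothing about capacity.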