The key tension: Throughput and latency often trade off. Batching raises throughput by amortising fixed per-request overhead, but each item has to wait for its batch, so latency for individual items goes up. Parallelism can improve both, but adds complexity.
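The batching trade-off can be illustrated with a toy cost model. This is a minimal sketch, not a measurement: the function name `batch_stats` and the costs (10 ms fixed overhead per batch, 1 ms per item) are assumptions chosen only to make the effect visible.

```python
def batch_stats(batch_size, overhead_ms=10.0, per_item_ms=1.0):
    """Return (throughput in items/s, latency in ms) for one batch size.

    Assumed toy model: every batch pays a fixed overhead plus a per-item
    cost, and an item is only finished when its whole batch is finished.
    """
    batch_time_ms = overhead_ms + batch_size * per_item_ms
    throughput = batch_size / batch_time_ms * 1000.0
    return throughput, batch_time_ms


for size in (1, 10, 100):
    throughput, latency_ms = batch_stats(size)
    print(f"batch={size:>3}  throughput={throughput:6.1f} items/s  latency={latency_ms:5.1f} ms")
```

In this model, larger batches push throughput up while every item in the batch sees a longer wait, which is the trade-off described above.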
Professional framing:
"We optimised for throughput at the cost of p99 latency."
"Latency is within SLO, but we need to increase throughput to handle the upcoming load."
"At high throughput, latency degrades — the bottleneck is the connection pool."
2 / 5
A storage benchmark reports: "Sequential read: 3,500 MB/s. Random read: 82,000 IOPS at 4KB block size." What does IOPS measure and why is it different from MB/s?
IOPS vs. MB/s — when each metric matters
IOPS = I/O Operations Per Second — counts individual read/write operations
Critical for: databases, transaction processing, random access workloads
A database reading individual rows does thousands of small, random 4KB reads
High IOPS, low transfer size per operation
MB/s (throughput) = bytes transferred per second
Critical for: video streaming, bulk data exports, backup jobs, sequential file reads
Few large sequential operations per second
Low IOPS, high transfer size per operation
The difference in the benchmark:
82,000 IOPS at 4KB = 82,000 × 4KB = 328,000 KB/s ≈ 328 MB/s for random reads (about 320 MiB/s in binary units)
3,500 MB/s sequential = more than ten times the random-read throughput, delivered through far fewer, much larger operations
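The conversion between the two metrics is simple arithmetic, sketched here in Python. The helper names are hypothetical, decimal units are assumed (1 MB = 1,000 KB; in binary units the random-read figure comes out at about 320 MiB/s), and the 1,000 KB sequential block size is an assumption, since the benchmark does not state one.

```python
def iops_to_mb_per_s(iops, block_size_kb):
    """Throughput in MB/s for a given IOPS figure at a fixed block size."""
    return iops * block_size_kb / 1000.0


def mb_per_s_to_iops(mb_per_s, block_size_kb):
    """Operations per second needed to sustain a throughput at a block size."""
    return mb_per_s * 1000.0 / block_size_kb


# Random reads from the benchmark: 82,000 IOPS at 4 KB.
random_mb_per_s = iops_to_mb_per_s(82_000, 4)

# Sequential reads: 3,500 MB/s, ASSUMING 1,000 KB (1 MB) per operation.
sequential_iops = mb_per_s_to_iops(3_500, 1_000)

print(f"random:     {random_mb_per_s:.0f} MB/s at 82,000 IOPS")
print(f"sequential: 3,500 MB/s at {sequential_iops:.0f} IOPS (assumed 1 MB blocks)")
```

Under that block-size assumption, the sequential workload moves over ten times the data with a small fraction of the operations.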
Professional vocabulary:
"This workload is IOPS-bound — sequential throughput is irrelevant here."
"The database requires [X] IOPS — this storage tier can only deliver [Y]."
"NVMe SSDs provide both high IOPS and high sequential throughput."
3 / 5
A benchmark report includes: "Results measured after a 30-second warm-up. Baseline: v2.3.1. Regression threshold: 10%." What does 'regression threshold' mean in this context?
Regression threshold in benchmarks
A performance regression is when new code is measurably slower than the baseline (previous version). A regression threshold defines when the difference is significant enough to act on.
In this benchmark:
Baseline = v2.3.1 performance (the reference point)
Regression threshold = 10%
If the new version is >10% slower than the baseline on any key metric, the benchmark fails; in a gated CI/CD pipeline, that failure blocks the release
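Why a threshold matters: Benchmarks have natural variance. Running the same test twice might give results that differ by 3% because of OS scheduling, cache state, or GC pressure. A 10% threshold sits well above that noise, so a failure is far more likely to be a real regression than a measurement artefact. The cost is that regressions smaller than the threshold pass unnoticed.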
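A threshold check of this kind might look like the following. This is a minimal sketch under stated assumptions: the function name `is_regression`, its parameters, and the sample numbers are hypothetical and not taken from any particular benchmarking tool.

```python
def is_regression(baseline, candidate, threshold_pct=10.0, higher_is_better=False):
    """Return True if candidate is worse than baseline by more than threshold_pct.

    For latency-style metrics (lower is better) an increase is a regression;
    for throughput-style metrics (higher is better) a decrease is.
    """
    if higher_is_better:
        worse_by_pct = (baseline - candidate) / baseline * 100.0
    else:
        worse_by_pct = (candidate - baseline) / baseline * 100.0
    return worse_by_pct > threshold_pct


# Hypothetical latency results in ms: baseline 100, two candidate builds.
print(is_regression(100.0, 103.0))  # 3% slower: treated as noise
print(is_regression(100.0, 115.0))  # 15% slower: flagged

# Hypothetical throughput results in requests/s.
print(is_regression(1000.0, 850.0, higher_is_better=True))  # 15% lower: flagged
```

A 3% difference stays under the 10% threshold and passes, while a 15% difference in either direction of "worse" is flagged.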
Common regression detection vocabulary:
"The benchmark detected a 15% latency regression in v2.4.0 — above the 10% threshold."
"CI/CD pipeline blocked: throughput regression of [X%] detected compared to baseline."
"No regression detected — all metrics within the 5% threshold."
"We bisected to the commit that introduced the regression."
4 / 5
A benchmark shows jitter of 8ms on a 45ms baseline latency. What is jitter and why does it matter?
Jitter — latency variance
Jitter = the variation in latency across measurements. It is a measure of consistency, not speed.
Example: If jitter is reported as a ± deviation, a baseline latency of 45ms with 8ms jitter means response times typically range from ~37ms to ~53ms. That is about 18% of the baseline; whether it counts as low depends on the use case.
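How jitter is computed varies between tools. Two common choices are the standard deviation of the latency samples and the mean absolute difference between consecutive samples. The sketch below computes both; the function name `jitter_summary` and the sample values are made up for illustration and are not benchmark results.

```python
import statistics


def jitter_summary(latencies_ms):
    """Return (mean, standard deviation, mean consecutive difference) in ms."""
    mean_ms = statistics.mean(latencies_ms)
    # Spread of all samples around the mean (population standard deviation).
    stdev_ms = statistics.pstdev(latencies_ms)
    # Average change from one measurement to the next.
    deltas = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    mean_delta_ms = statistics.mean(deltas)
    return mean_ms, stdev_ms, mean_delta_ms


# Made-up samples scattered around a 45 ms baseline.
samples_ms = [45, 37, 53, 41, 49, 45, 38, 52]
mean_ms, stdev_ms, mean_delta_ms = jitter_summary(samples_ms)
print(f"mean={mean_ms:.1f} ms  stdev={stdev_ms:.1f} ms  mean delta={mean_delta_ms:.1f} ms")
```

The two figures generally differ for the same data, so a jitter number is only comparable across reports when the definition behind it is stated.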
High jitter = poor user experience even if the average is acceptable:
Video calls become choppy
Real-time games feel inconsistent
Streaming has buffering pauses
API clients see unpredictable response times
Common jitter sources:
Network congestion and packet queuing
Garbage collection pauses (GC jitter in JVM applications)
CPU scheduling delays under load
Shared infrastructure resource contention
Vocabulary:
"Latency jitter is [X]ms — within acceptable bounds for this use case."
"High jitter is causing inconsistent response times — p99 is [X]x higher than p50."
"GC pause jitter is contributing to p99 outliers."
5 / 5
How would you complete this sentence professionally: "The new caching layer reduced p99 latency from 380ms to 85ms, ___"
Completing a benchmark result sentence professionally
The professional completion does four things:
States the relative improvement: a 78% reduction, from (380 - 85) / 380 × 100 ≈ 77.6
Connects to the SLO: "within our 100ms SLO threshold" — the reason this metric matters
States the business/reliability impact: "eliminating the primary cause of timeout errors" — the user-visible consequence
Specifies the context: "under peak load" — where the improvement was most needed
Put together: "The new caching layer reduced p99 latency from 380ms to 85ms, a 78% reduction that brings it within our 100ms SLO threshold and eliminates the primary cause of timeout errors under peak load."
Why "really good" and "a lot faster" are insufficient in professional writing: They contain no quantification, no context, and no connection to why the result matters. In performance reviews, incident reports, and sprint demos, qualitative adjectives without numbers are not informative.
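The quantification itself is easy to automate. The sketch below computes the relative reduction and the speed-up factor for the 380ms to 85ms example and formats a result sentence; the helper names are hypothetical, and the 100ms SLO is the figure used in the example above.

```python
def percent_reduction(before, after):
    """Relative improvement for a lower-is-better metric, in percent."""
    return (before - after) / before * 100.0


def speedup_factor(before, after):
    """How many times smaller the new value is than the old one."""
    return before / after


def describe(metric, before_ms, after_ms, slo_ms):
    """Build a result sentence with numbers, baseline, and SLO context."""
    pct = percent_reduction(before_ms, after_ms)
    sentence = f"{metric} improved by {pct:.0f}%, from {before_ms}ms to {after_ms}ms"
    if after_ms <= slo_ms:
        sentence += f", bringing it within the {slo_ms}ms SLO"
    return sentence + "."


print(describe("p99 latency", 380, 85, 100))
print(f"Speed-up: {speedup_factor(380, 85):.1f}x")
```

Reporting both the percentage and the "Nx" factor is a stylistic choice; either is fine as long as the before and after values appear alongside it.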
Model sentence patterns:
"[Metric] improved by [X%], from [A] to [B], bringing it within the [SLO/target]."
"This [X%] improvement eliminates [user-visible problem] under [condition]."
"After the optimisation, p99 is [X] — [X% / Nx] better than baseline."