Latency vs Throughput: English Usage Guide for IT Professionals

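Latency measures how long one request takes; throughput measures how many requests a system handles per unit of time, usually per second. A system can have low latency but low throughput (fast for one user, slow under load) or high throughput but high latency (it processes many requests, each slowly). Both matter, but for different reasons: latency determines how responsive the system feels to an individual user, while throughput determines how much load the system can sustain.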
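
The two combinations can be made concrete with a back-of-the-envelope model. The sketch below assumes an idealized system in which every request takes a fixed service time and each worker handles one request at a time; the function name `max_throughput` and all numbers are illustrative, not measurements from a real system.

```python
def max_throughput(service_time_ms: float, workers: int) -> float:
    """Upper bound on requests per second for an idealized system.

    Assumes every request takes exactly service_time_ms and each worker
    processes one request at a time, with no queueing or coordination cost.
    """
    if service_time_ms <= 0 or workers < 1:
        raise ValueError("service_time_ms must be > 0 and workers must be >= 1")
    return workers * 1000.0 / service_time_ms


# Low latency, low throughput: each request is fast, but there is one worker.
fast_single = max_throughput(service_time_ms=10, workers=1)

# High latency, high throughput: each request is slow, but many run at once.
slow_parallel = max_throughput(service_time_ms=500, workers=200)

print(f"10 ms per request, 1 worker:      {fast_single:.0f} req/s")
print(f"500 ms per request, 200 workers:  {slow_parallel:.0f} req/s")
```

In this model the second system serves four times as many requests per second even though each individual request takes fifty times longer, which is why the two terms should not be used interchangeably.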

Side-by-side comparison

Aspect            | Latency                              | Throughput
------------------+--------------------------------------+-----------------------------------------
What it measures  | Time for one operation (ms)          | Operations per unit of time (req/s)
Analogy           | How fast one car travels             | How many cars pass a toll booth per hour
Optimised by      | Caching, reducing hops, faster code  | Parallelism, horizontal scaling, queues
Key metric form   | p50, p95, p99 in milliseconds        | Requests per second (RPS), TPS

Example sentences

Latency

  • "Our p99 latency is 450 ms — the slowest 1% of requests take nearly half a second."
  • "Adding a Redis cache cut database latency from 80 ms to under 2 ms."

Throughput

  • "After horizontal scaling, throughput increased from 500 to 3,000 requests per second."
  • "The message queue increases throughput by letting workers process jobs in parallel."

Exercises: choose the correct English usage

Select the best answer for each question, then check your reasoning.

1. A user complains the page "takes forever to load". They are describing high ___.

2. An engineer says "we need to handle 10,000 requests per second." They are talking about ___.

3. Which metric uses "p99"?

4. Which sentence uses "throughput" correctly?

5. True or false: reducing latency always increases throughput.

Frequently asked questions

What is the difference between latency and response time?

"Response time" usually means the total time from request to response, including processing. "Latency" often refers specifically to the network delay component, though in practice the terms are used interchangeably.

What does p99 mean?

"p99 latency" is the 99th-percentile latency: 99% of requests complete at or below this value, and the slowest 1% take longer. It describes the tail experience while ignoring the most extreme 1% of requests, which would dominate a maximum and distort an average.
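
As a minimal sketch of how such a figure might be computed, the example below uses the nearest-rank method, which is one of several common percentile definitions; monitoring tools may interpolate differently. The helper `percentile` and the sample latencies are made up for illustration.

```python
import math


def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(latencies_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical sample of 100 request latencies in milliseconds,
# including one extreme outlier of 3000 ms.
samples = [20] * 50 + [40] * 45 + [120] * 3 + [450] + [3000]

p50 = percentile(samples, 50)
p95 = percentile(samples, 95)
p99 = percentile(samples, 99)
mean = sum(samples) / len(samples)

print(f"p50={p50} ms  p95={p95} ms  p99={p99} ms  mean={mean:.1f} ms")
```

With this sample, the single 3000 ms outlier pulls the mean well above the median, while p99 stays at 450 ms because only the slowest 1% of requests lie beyond it.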

How is throughput measured?

Throughput is measured in requests per second (RPS), transactions per second (TPS), or messages per second (MPS), depending on the system.
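
Whatever the unit, the calculation is a count divided by a time window. The sketch below assumes completion timestamps are already available, for example from a request log; the function name `throughput_per_second` and the timestamps are hypothetical.

```python
def throughput_per_second(completion_times_s, window_start_s, window_end_s):
    """Count completions inside [window_start_s, window_end_s) and divide
    by the window length in seconds."""
    if window_end_s <= window_start_s:
        raise ValueError("window_end_s must be greater than window_start_s")
    completed = sum(
        1 for t in completion_times_s if window_start_s <= t < window_end_s
    )
    return completed / (window_end_s - window_start_s)


# Hypothetical completion timestamps, in seconds since measurement began.
completions = [0.1, 0.4, 0.9, 1.2, 1.3, 1.8, 2.5, 3.1, 3.2, 3.9]

rps = throughput_per_second(completions, 0.0, 4.0)
print(f"{rps:.1f} requests per second")
```

Ten completions in a four-second window give 2.5 RPS; the same function yields TPS or MPS if the timestamps refer to transactions or messages instead.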