Vocabulary for API Rate Limiting: 30 Terms Explained in Context

Rate limiting controls how many requests a client can make to an API within a given time window. Every backend engineer talks about it, but the vocabulary trips up non-native speakers because many terms are metaphors (bucket, leak, burst) or look alike (throttle vs limit vs quota). This guide explains 30 essential terms in context, so you can both understand the docs and talk about rate limiting naturally.

The foundational terms

Term	Meaning	In a sentence
Rate limit	Max requests per time window	“The rate limit is 100 requests per minute.”
Throttle	To deliberately slow down requests	“We throttle clients that exceed the limit.”
Quota	Total allowance over a long period	“Your daily quota is 10,000 calls.”
Burst	A short spike of requests	“The client sent a burst of 500 at once.”
Backoff	Waiting longer between retries	“Use exponential backoff on 429s.”

The key distinction: a rate limit is requests per short window (per second/minute); a quota is the total over a long period (per day/month). Don’t use them interchangeably.

“You’re within your monthly quota, but you hit the per-second rate limit because you sent everything in one burst.”
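The distinction can be sketched in a few lines of Python. This is only an illustration: the `Allowance` class, the `check` function, and all the numbers are hypothetical and do not come from any real API.

```python
from dataclasses import dataclass


@dataclass
class Allowance:
    """Hypothetical plan limits; the numbers are illustrative only."""
    per_second_limit: int  # rate limit: applies to a short window
    monthly_quota: int     # quota: total over a long period


def check(allowance, sent_this_second, sent_this_month):
    """Return which allowance, if any, the next request would exceed."""
    if sent_this_month >= allowance.monthly_quota:
        return "quota exceeded"
    if sent_this_second >= allowance.per_second_limit:
        return "rate limited"
    return "ok"


plan = Allowance(per_second_limit=10, monthly_quota=100_000)

# Well within the monthly quota, but everything arrived in one burst.
print(check(plan, sent_this_second=10, sent_this_month=500))     # rate limited

# Requests are nicely spread out, but the monthly total is used up.
print(check(plan, sent_this_second=0, sent_this_month=100_000))  # quota exceeded
```

The two checks are independent, which is why a client can fail one while passing the other.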

Throttle vs limit vs reject

These three verbs describe different responses to too many requests:

Throttle — slow the client down (often by delaying responses)
Limit — cap the allowed rate
Reject / drop — refuse extra requests outright

“When you exceed the limit, we don’t drop your requests immediately — we throttle you first, then start rejecting if you keep pushing.”

The HTTP status code for “you’ve been rate limited” is 429 Too Many Requests. Engineers say “you got a 429” or “the API 429’d us.”
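The escalation from throttling to rejecting can be sketched as a small decision function. This is a minimal sketch, not how any particular API behaves: the function name `respond` and the thresholds `limit`, `reject_above`, and `delay_step` are assumptions made up for the example.

```python
def respond(requests_in_window, limit=100, reject_above=150, delay_step=0.25):
    """Return (action, http_status, delay_seconds) for the next request.

    All thresholds are hypothetical values chosen for illustration.
    """
    if requests_in_window <= limit:
        return ("serve", 200, 0.0)
    if requests_in_window <= reject_above:
        # Throttle: still answer, but more slowly the further over the limit.
        delay = (requests_in_window - limit) * delay_step
        return ("throttle", 200, delay)
    # Reject: refuse outright with 429 Too Many Requests.
    return ("reject", 429, 0.0)


print(respond(80))   # ('serve', 200, 0.0)
print(respond(104))  # ('throttle', 200, 1.0)
print(respond(151))  # ('reject', 429, 0.0)
```

A throttled client still gets an answer, only later; a rejected client gets a 429 and has to retry.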

The algorithms (and their metaphors)

Rate-limiting algorithms borrow vivid metaphors. Knowing the imagery makes the vocabulary stick.

Algorithm	Metaphor	Plain meaning
Token bucket	A bucket fills with tokens; each request spends one	Allows bursts up to the bucket size
Leaky bucket	Requests drip out at a steady rate	Smooths traffic to a constant rate
Fixed window	A counter resets every minute	Simple but allows edge spikes
Sliding window	A rolling time window	Smoother than fixed window
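To see what "edge spikes" means for a fixed window, here is a small comparison. It is a sketch under assumptions: the class names are invented, the sliding window is implemented as a simple log of timestamps (one of several possible variants), and time is passed in as a plain number of seconds so the example is repeatable.

```python
from collections import deque


class FixedWindow:
    """Counter that resets at the start of every window."""

    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new window has started: reset the counter.
            self.current_window, self.count = window_id, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLog:
    """Keeps timestamps of accepted requests from the last `window` seconds."""

    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.timestamps = deque()

    def allow(self, now):
        # Forget requests that have rolled out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


def count_allowed(limiter, times):
    return sum(limiter.allow(t) for t in times)


# Limit: 5 requests per 60 seconds.
# Traffic: 5 requests just before the minute boundary, 5 just after it.
times = [59.0 + i * 0.1 for i in range(5)] + [60.0 + i * 0.1 for i in range(5)]

fixed_allowed = count_allowed(FixedWindow(5, 60), times)
sliding_allowed = count_allowed(SlidingWindowLog(5, 60), times)
print(fixed_allowed, sliding_allowed)  # 10 5
```

The fixed window lets all 10 requests through within about 1.5 seconds because the counter resets at the boundary. The sliding window still "remembers" the first 5, so it accepts only those.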

“We use a token bucket because it tolerates short bursts — the bucket holds 100 tokens, refilling at 10 per second. A leaky bucket would smooth that out but reject bursts.”

Note the verbs: the bucket fills and refills; requests consume or spend tokens; when empty, requests are rejected until it refills.
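Those verbs map directly onto code. The sketch below uses the numbers from the quote (a bucket of 100 tokens refilling at 10 per second); the class name `TokenBucket` and the explicit `now` argument are assumptions made for the example, so that time can be controlled instead of read from a real clock.

```python
class TokenBucket:
    """Minimal token bucket: fills up to `capacity`, refills at `refill_rate`/s."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # the bucket starts full
        self.last = now

    def allow(self, now, cost=1):
        # Refill: add tokens for the time that has passed, up to capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost  # the request spends (consumes) tokens
            return True
        return False             # bucket is empty: the request is rejected


bucket = TokenBucket(capacity=100, refill_rate=10)

# A burst of 150 requests at t=0: the first 100 spend the whole bucket.
burst = sum(bucket.allow(now=0.0) for _ in range(150))

# One second later the bucket has refilled by 10 tokens.
later = sum(bucket.allow(now=1.0) for _ in range(150))

print(burst, later)  # 100 10
```

The burst is tolerated up to the bucket size; after that, the client is held to the refill rate.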

Retry vocabulary

When a client gets rate-limited, it should retry intelligently. The vocabulary here is precise.

Term	Meaning
Retry	Try the request again
Backoff	Increase the wait between retries
Exponential backoff	Double the wait each time (1s, 2s, 4s…)
Jitter	Random variation added to backoff
Retry-After	A header telling you when to retry

“Respect the Retry-After header. If it’s missing, fall back to exponential backoff with jitter so all clients don’t retry at the same instant — that’s the thundering herd problem.”

Thundering herd describes many clients retrying simultaneously, overwhelming the server again. Jitter spreads the retries out over time, which makes this far less likely.
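The retry rule in the quote can be sketched as one function. This is an illustration under assumptions: the name `retry_delay` and the parameters `base` and `cap` are invented, the jitter style (a random wait between zero and the current ceiling) is only one common variant, and the sketch handles only the numeric form of Retry-After. The header can also carry an HTTP date, which this example simply ignores.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random):
    """Seconds to wait before retry number `attempt` (0 for the first retry).

    `base` and `cap` are hypothetical defaults chosen for illustration.
    """
    if retry_after is not None:
        try:
            # Respect the server's instruction when it is given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form: not handled in this sketch, fall through

    # Exponential backoff: the ceiling doubles each attempt (1s, 2s, 4s...),
    # limited by `cap` so the wait never grows without bound.
    ceiling = min(cap, base * 2 ** attempt)

    # Jitter: pick a random point below the ceiling so clients spread out.
    return rng.uniform(0, ceiling)


print(retry_delay(attempt=3, retry_after="7"))  # 7.0
print(retry_delay(attempt=3))                   # some value between 0 and 8
```

Because each client draws its own random wait, a group of clients that all failed at the same moment will not all come back at the same moment.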

Server-side vocabulary

Term	Meaning
Per-key limit	Limit applied per API key
Per-IP limit	Limit applied per IP address
Global limit	A cap on total traffic
Soft limit	A warning threshold
Hard limit	An absolute cap, enforced strictly
Burst allowance	Extra headroom for short spikes

“We enforce a per-key rate limit with a small burst allowance, plus a global hard limit to protect the backend during traffic spikes.”

The contrast between soft limit (warns you) and hard limit (stops you) is worth memorising.
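Here is how these server-side terms might fit together in code. It is a deliberately simplified sketch: the class name `Limits`, its parameters, and the returned messages are all hypothetical, and the counters cover a single window only (a real limiter would also reset or age them).

```python
from collections import Counter


class Limits:
    """Per-key soft and hard limits plus a global hard limit, for one window."""

    def __init__(self, per_key_soft, per_key_hard, global_hard):
        self.per_key_soft = per_key_soft
        self.per_key_hard = per_key_hard
        self.global_hard = global_hard
        self.per_key = Counter()  # requests accepted per API key
        self.total = 0            # requests accepted across all keys

    def record(self, api_key):
        # Hard limits stop the request.
        if self.total >= self.global_hard:
            return "rejected: global hard limit"
        if self.per_key[api_key] >= self.per_key_hard:
            return "rejected: per-key hard limit"
        self.per_key[api_key] += 1
        self.total += 1
        # The soft limit only warns; the request is still accepted.
        if self.per_key[api_key] > self.per_key_soft:
            return "accepted with warning"
        return "accepted"


limits = Limits(per_key_soft=2, per_key_hard=3, global_hard=5)
results_a = [limits.record("key-a") for _ in range(4)]
results_b = [limits.record("key-b") for _ in range(3)]

print(results_a[2])  # accepted with warning
print(results_a[3])  # rejected: per-key hard limit
print(results_b[2])  # rejected: global hard limit
```

Note that "key-b" is rejected by the global limit even though it never reached its own per-key limit.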

Phrases for discussing rate limits in meetings

“We’re getting rate limited by the third-party API.”
“Let’s back off and retry instead of hammering it.”
“We’re hitting the ceiling on the per-key quota.”
“Can we request a higher limit from the vendor?”
“The client isn’t respecting the Retry-After header.”

“Our integration keeps getting 429’d because we’re not backing off. We’re effectively hammering their API. Let’s add exponential backoff with jitter and respect the Retry-After header.”

The verb hammer (to send too many requests aggressively) is common and useful.

Common mistakes

Confusing quota and rate limit. Quota = long-term total; rate limit = short-term rate. They’re enforced separately.
Saying “limited” when you mean “throttled.” Being throttled means your requests are slowed down; being limited or rejected means they are refused.
Using “burst” incorrectly as a verb. Say “a burst of requests” (noun) or “traffic bursts,” not “we bursted the API.” The past tense of “burst” is also “burst.”
Mispronouncing “quota.” It’s /ˈkwoʊtə/ — “KWOH-tuh,” not “koo-OH-ta.”
Misspelling or mispronouncing “exponential.” It’s “exponential backoff,” not “exponentional.” Practise the stress: ex-po-NEN-tial.

A mini-dialogue using the vocabulary

A: “Why are we getting 429s from the payments API?”

B: “We’re sending requests in a burst at the top of every hour. We blow through the token bucket instantly.”

A: “Can we smooth that out?”

B: “Yes — add jitter so we don’t all fire at once, and respect their Retry-After. Long term, we should ask for a higher rate limit, but we’re still well under our daily quota.”

Quick reference glossary

Idempotent retry — a retry that is safe to repeat because it does not duplicate side effects
Circuit breaker — stops calling a failing service entirely for a while
Rate limiter — the component that enforces the limit
Window — the time period the limit applies to
Headroom — spare capacity below the limit
Cooldown — a forced wait before you can retry
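
Of these glossary terms, circuit breaker and cooldown are the easiest to picture in code. The sketch below is a simplified illustration, not a production pattern: the class name, the `failure_threshold` and `cooldown` values, and the explicit `now` argument (used instead of a real clock) are all assumptions.

```python
class CircuitBreaker:
    """Stops calls after repeated failures, then allows a trial after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (calls allowed)

    def allow(self, now):
        if self.opened_at is None:
            return True
        # Open circuit: block calls until the cooldown has passed,
        # then let a trial call through.
        return now - self.opened_at >= self.cooldown

    def record_success(self):
        # The service has recovered: close the circuit again.
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open the circuit (or restart the cooldown)


breaker = CircuitBreaker(failure_threshold=3, cooldown=30.0)
for t in (0.0, 1.0, 2.0):
    breaker.record_failure(now=t)

print(breaker.allow(now=10.0))  # False
print(breaker.allow(now=32.0))  # True
```

Unlike backoff, which slows retries down, an open circuit breaker stops calling the failing service completely until the cooldown ends.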

Key takeaways

Rate limit = per short window; quota = long-term total. Keep them distinct.
Throttle (slow), limit (cap), reject/drop (refuse) describe different responses.
Learn the bucket metaphors: token bucket tolerates bursts, leaky bucket smooths traffic.
On 429s, use exponential backoff with jitter and respect Retry-After to avoid the thundering herd.

Master these 30 terms and you’ll read rate-limiting docs effortlessly — and sound precise when your team debugs that next wave of 429s.
