Learn the vocabulary of refilling a bucket of tokens at a steady rate to allow brief bursts above that rate.
0 / 5 completed
1 / 5
At standup, a dev mentions a rate-limiting scheme that refills a bucket with tokens at a steady rate, lets a request through only if a token is available to remove from the bucket, and allows the bucket to briefly hold a stockpile of unused tokens to absorb short bursts above the steady rate. What is this algorithm called?
Token bucket algorithm is exactly this: the token bucket algorithm refills a bucket with tokens at a steady rate, lets a request through only when a token is available to remove from the bucket, and allows unused tokens to accumulate up to a capped bucket size, letting a client briefly burst above the steady rate as long as it has stockpiled unused tokens. A hash collision is an unrelated hash-table concept about two keys sharing a bucket. This steady-refill-plus-stockpiled-burst-allowance is exactly why the token bucket algorithm is the standard building block behind most API rate limiters.
2 / 5
During a design review, the team uses the token bucket algorithm to rate-limit an API, specifically because allowing unused tokens to stockpile up to a capped bucket size lets a client burst briefly above the steady rate, instead of rejecting every request the instant it exceeds a strict per-moment cap. Which capability does this provide?
Token bucket algorithm here provides Tolerating short bursts above the steady rate without abandoning the overall limit, since tokens accumulate during quiet periods up to the bucket's capped capacity, so a client that was recently idle can briefly send a burst of requests by consuming its stockpiled tokens, before falling back to the steady refill rate once the stockpile runs out. A strict per-moment cap that rejects any request the instant a fixed rate is exceeded, even briefly, offers no room for a legitimate short burst at all. This stockpile-then-burst tolerance is exactly why the token bucket algorithm is favored over a strict per-moment cap for rate-limiting real client traffic.
3 / 5
In a code review, a dev notices an API rate-limiting feature rejects any request the instant a fixed per-second cap is exceeded, even for a single request arriving a moment early, instead of allowing a capped stockpile of unused capacity to absorb brief bursts. What does this represent?
This is a missed token-bucket opportunity, since using the token bucket algorithm's stockpiled-token approach would let a client briefly burst above the steady rate using accumulated tokens, instead of rejecting any request that exceeds a strict per-moment cap. A cache eviction policy is an unrelated concept about discarded cache entries. This strict-per-moment-cap pattern is exactly the kind of unnecessary rejection a reviewer flags once legitimate clients are known to send occasional short bursts.
4 / 5
An incident report shows an API rate limiter rejected a burst of legitimate requests from a client that had been idle moments before, because it enforced a strict per-moment cap with no way to absorb a brief burst using accumulated unused capacity. What practice would prevent this?
Switching to the token bucket algorithm's steady-refill-plus-stockpile approach removes the unnecessary rejection of legitimate bursts. Continuing to enforce a strict per-moment cap with no burst tolerance regardless of how often legitimate clients actually send short bursts is exactly what caused the issue described in this incident. This token-bucket approach is the standard fix once legitimate clients are confirmed to send occasional short bursts of otherwise-compliant traffic.
5 / 5
During a PR review, a teammate asks why the team reaches for the token bucket algorithm instead of the leaky bucket algorithm, given that the leaky bucket algorithm is also a classic rate-limiting scheme built around a bucket metaphor. What is the reasoning?
The token bucket algorithm lets a client stockpile unused tokens and briefly burst above the steady rate, while the leaky bucket algorithm processes queued requests at a strictly steady output rate regardless of how bursty the arrivals were, smoothing traffic out but never allowing that same burst tolerance. This is exactly why the token bucket algorithm is favored for limiting client request rates, while the leaky bucket algorithm is favored for producing a perfectly smoothed output stream.