Learn the vocabulary of refilling a bucket of tokens at a steady rate to allow bursts while capping the long-run average rate.
1 / 5
A teammate explains that a rate limiter maintains a bucket that refills with tokens at a steady rate up to a fixed capacity, and each incoming request must consume one token to proceed, so short bursts up to the bucket's capacity are allowed while the long-run average rate stays capped. What rate-limiting algorithm is being described?
This is the token bucket algorithm. The bucket refills with tokens at a steady rate up to a fixed capacity, and each incoming request must consume one token to proceed. Once the bucket is empty, requests are rejected or delayed until more tokens arrive. Short bursts up to the bucket's capacity are therefore allowed, while the long-run average rate is capped at the refill rate. A DNS zone transfer is an unrelated concept: it replicates name server records between servers. The combination of a steady refill and a burst allowance is why the token bucket is often favored over rate limiters that allow no burst at all.
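The refill-and-consume rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a production limiter: the class name `TokenBucket`, the method `allow`, and the injectable `clock` parameter are all hypothetical names chosen for this example, and the sketch rejects requests rather than delaying them.

```python
import time


class TokenBucket:
    """Minimal token bucket: refills continuously, never above `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens the bucket can hold
        self.clock = clock        # injectable clock (an assumption, eases testing)
        self.tokens = capacity    # assumption: the bucket starts full
        self.last = clock()       # time of the last refill calculation

    def allow(self):
        """Consume one token if available; return False when the bucket is empty."""
        now = self.clock()
        # Credit tokens for the elapsed time, capped at the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Refilling lazily on each call, based on elapsed time, avoids a background timer; the bucket only needs to remember its token count and the time it was last updated.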
2 / 5
During a design review, the team adopts a token bucket rate limiter for a public API, specifically so a client can burst up to fifty requests instantly if it has been idle, while still being capped to ten requests per second on average over time. Which capability does this provide?
A token bucket rate limiter here provides burst tolerance combined with a capped long-run average rate. With a capacity of fifty tokens and a refill rate of ten tokens per second, a client that has been idle long enough to fill the bucket can send fifty requests at once, and after that it is held to ten requests per second because that is how fast tokens return. A limiter that rejects every request beyond a strict per-second count, with zero burst allowance, would turn away a legitimate client that had simply been idle and is now catching up on a batch of queued work. This accumulate-then-burst behavior with a capped average is why the token bucket is favored for public APIs that expect bursty but bounded traffic.
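The numbers in this scenario (a burst of fifty, ten per second on average) can be checked with a small simulation. The sketch below is illustrative only: the function name `admitted` and the two traffic patterns are assumptions made up for this example, and the bucket is assumed to start full, as it would after a long idle period.

```python
def admitted(arrivals, rate=10.0, capacity=50.0):
    """Count how many requests are admitted, given sorted arrival times in seconds."""
    tokens, last, ok = capacity, 0.0, 0  # assumption: bucket starts full at t = 0
    for t in arrivals:
        # Refill for the time elapsed since the previous request, capped at capacity.
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= 1:
            tokens -= 1
            ok += 1
    return ok


# An idle client sends 80 requests at the same instant: only the stored tokens are spent.
burst = admitted([0.0] * 80)
print(burst)  # 50

# A client sends 100 requests per second for 60 seconds.
sustained = admitted([i / 100 for i in range(6000)])
print(sustained)  # about 650: the initial 50 plus roughly 10 per second
```

The first case shows the burst allowance and the second shows the cap: however hard the client pushes, admissions over a long period approach the refill rate.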
3 / 5
In a code review, a dev notices a rate limiter rejects any request the instant a fixed per-second count is exceeded, with no ability for an idle client to accumulate capacity for a later burst, instead of using a token bucket that allows accumulated tokens to be spent in a burst. What does this represent?
This is a missed token-bucket opportunity: accumulating tokens during idle periods would let a legitimate client burst afterward instead of being rejected the instant a strict per-second count is exceeded. A cache eviction policy is an unrelated concept that decides which cache entries to discard. A limiter with no way to accumulate burst capacity is exactly the kind of overly rigid limiting a reviewer flags when legitimate bursty clients are expected.
4 / 5
An incident report shows a legitimate client that had been idle for an hour and then sent a queued batch of thirty requests had most of those requests rejected, because the rate limiter enforced a strict fixed per-second cap with no way to accumulate unused capacity. What practice would prevent this?
Switching to a token bucket rate limiter lets unused capacity accumulate as tokens during idle periods and be spent on a legitimate burst afterward. Keeping the strict fixed per-second cap with no accumulation is what caused the incident in the first place, and it would keep rejecting legitimate bursty clients. Accumulate-then-burst is the standard fix once a strict per-second cap is confirmed to reject legitimate bursty traffic.
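The incident can be replayed in a short sketch that runs the same batch through both kinds of limiter. The incident report does not state the actual cap, so the values here (a limit of ten per second, a bucket capacity of thirty) are assumptions chosen for illustration, as are the function names `strict_cap` and `token_bucket`.

```python
def strict_cap(arrivals, limit=10):
    """Strict per-second cap: at most `limit` requests per one-second window."""
    counts, ok = {}, 0
    for t in arrivals:
        window = int(t)  # whole-second window this request falls into
        if counts.get(window, 0) < limit:
            counts[window] = counts.get(window, 0) + 1
            ok += 1
    return ok  # unused capacity from earlier windows is never carried over


def token_bucket(arrivals, rate=10.0, capacity=30.0):
    """Token bucket that starts empty at t = 0 and refills while the client is idle."""
    tokens, last, ok = 0.0, 0.0, 0
    for t in arrivals:
        # Idle time turns into tokens, up to the bucket's capacity.
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= 1:
            tokens -= 1
            ok += 1
    return ok


# The client is idle for an hour (3600 s), then sends its queued batch of 30 at once.
batch = [3600.0] * 30
print(strict_cap(batch))    # 10 admitted, 20 rejected
print(token_bucket(batch))  # 30 admitted
```

Under these assumed settings the strict cap rejects two thirds of the batch, while the bucket admits all of it because the idle hour filled it to capacity.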
5 / 5
During a PR review, a teammate asks why the team reaches for a token bucket instead of a strict fixed-window counter that simply rejects everything past a hard per-second limit, given that a fixed-window counter is simpler to implement. What is the reasoning?
A token bucket trades a small amount of extra bookkeeping (the current token count and the time of the last refill) for the ability to tolerate legitimate bursts while still capping the long-run average. A fixed-window counter is simpler, but its count resets every window and unused capacity is never carried forward, so any burst larger than the per-window limit is rejected, even from a client that has been well under its average rate. This is why a token bucket is favored for APIs expecting legitimate bursty traffic, while a fixed-window counter remains acceptable when traffic is expected to be smooth and anything beyond the per-window limit should always be rejected.