Rate limiting: caps how many requests a client can make in a time window, preventing individual clients from exhausting shared resources and ensuring availability for all users.
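As a rough illustration of capping requests per client per time window, here is a minimal fixed-window counter. The class name, the `client_id` key, and passing `now` explicitly are assumptions made for this sketch, not part of any particular library.

```python
from collections import defaultdict
from typing import Dict, Tuple


class FixedWindowLimiter:
    """Hypothetical per-client fixed-window counter (illustrative only)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit                    # max requests per window
        self.window_seconds = window_seconds  # window length in seconds
        # (client_id, window index) -> requests seen in that window.
        # Old windows are never evicted here; a real limiter would expire them.
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        key = (client_id, window)
        if self.counts[key] >= self.limit:
            return False  # this client has used up the current window
        self.counts[key] += 1
        return True
```

Each client gets its own counter, so one client hitting the cap does not affect the others.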
2 / 5
How does a token bucket algorithm work?
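Token bucket: a bucket holds up to N tokens and is refilled at a fixed rate; each request consumes one token, and a request that finds the bucket empty is rejected or delayed. This allows bursts of up to N requests (a full bucket), after which throughput settles at the refill rate. It accommodates bursty traffic better than fixed-window approaches.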
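The behavior above can be sketched in a few lines. This is a minimal, single-threaded version; the names `capacity`, `refill_rate`, and the injectable `clock` parameter are assumptions chosen for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal token bucket sketch; not thread-safe."""

    def __init__(self, capacity: int, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # N: the largest burst allowed
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so tests can fake time
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never exceeding capacity.
        refill = (now - self.last) * self.refill_rate
        self.tokens = min(float(self.capacity), self.tokens + refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request spends one token
            return True
        return False          # bucket empty: caller rejects or delays
```

With `capacity=3` and `refill_rate=1.0`, three back-to-back requests pass and the fourth fails; two seconds later, two more are allowed.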
3 / 5
What does a leaky bucket algorithm enforce?
Leaky bucket: requests enter a bounded queue and are processed at a constant rate (the leak). Requests that arrive while the queue is full are dropped. This enforces a smooth output rate regardless of input bursts.
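A minimal sketch of the queue-based variant follows. To keep it deterministic, the caller tells the bucket how much time has passed instead of the bucket reading a clock; `queue_limit`, `leak_rate`, `offer`, and `leak` are names invented for this example.

```python
from collections import deque
from typing import Deque, List


class LeakyBucket:
    """Illustrative leaky bucket: bounded queue drained at a fixed rate."""

    def __init__(self, queue_limit: int, leak_rate: float) -> None:
        self.queue: Deque[str] = deque()
        self.queue_limit = queue_limit  # max requests waiting
        self.leak_rate = leak_rate      # requests processed per second
        self.credit = 0.0               # fractional progress toward next leak

    def offer(self, request: str) -> bool:
        if len(self.queue) >= self.queue_limit:
            return False  # queue full: the request is dropped
        self.queue.append(request)
        return True

    def leak(self, elapsed: float) -> List[str]:
        """Return the requests that become due after `elapsed` seconds."""
        self.credit += elapsed * self.leak_rate
        released: List[str] = []
        while self.queue and self.credit >= 1:
            released.append(self.queue.popleft())
            self.credit -= 1
        if not self.queue:
            self.credit = 0.0  # an idle bucket does not bank capacity
        return released
```

However bursty the calls to `offer` are, `leak` never releases more than `leak_rate` requests per second of elapsed time.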
4 / 5
What HTTP status code is conventionally returned when a client is rate-limited?
429: RFC 6585 defines 429 Too Many Requests for rate limiting. The response may include a Retry-After header, given as a delay in seconds or an HTTP date, indicating when the client may try again.
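For illustration, a framework-neutral helper that builds such a response might look like this. The function name and the `(status, headers, body)` return shape are assumptions for the sketch; only `http.HTTPStatus` comes from the standard library.

```python
import math
from http import HTTPStatus
from typing import Dict, Tuple


def rate_limited_response(retry_after_seconds: float) -> Tuple[int, Dict[str, str], str]:
    """Build a hypothetical (status, headers, body) triple for a 429."""
    status = HTTPStatus.TOO_MANY_REQUESTS
    headers = {
        # Retry-After in its delay-seconds form: a non-negative integer.
        "Retry-After": str(max(0, math.ceil(retry_after_seconds))),
        "Content-Type": "text/plain; charset=utf-8",
    }
    body = f"{status.value} {status.phrase}\n"
    return status.value, headers, body
```

Rounding the delay up means a client that honors the header does not retry slightly too early.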
5 / 5
What is the difference between rate limiting and throttling?
Rate limiting vs throttling: rate limiting typically rejects requests above the limit immediately, usually with a 429. Throttling degrades service gracefully by slowing or queuing excess requests instead of rejecting them outright.
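The contrast can be sketched as two decision functions applied to the same over-limit request. Everything here is hypothetical: the `Decision` type, the parameter names, and in particular the `max_wait` cutoff, which assumes a throttler eventually gives up and rejects when the wait would be too long.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    accepted: bool        # will the request be served at all?
    delay_seconds: float  # how long it waits before being served
    status: int           # HTTP status the client eventually sees


def rate_limit(over_limit: bool) -> Decision:
    # Rate limiting: anything over the limit is rejected right away.
    if over_limit:
        return Decision(False, 0.0, 429)
    return Decision(True, 0.0, 200)


def throttle(over_limit: bool, seconds_until_capacity: float,
             max_wait: float) -> Decision:
    # Throttling: serve the request, but later, once capacity frees up.
    if not over_limit:
        return Decision(True, 0.0, 200)
    if seconds_until_capacity <= max_wait:
        return Decision(True, seconds_until_capacity, 200)
    # Assumed fallback: reject when the wait would exceed max_wait.
    return Decision(False, 0.0, 429)
```

For the same over-limit request, `rate_limit` returns a 429 at once, while `throttle` returns a delayed success as long as the wait stays within `max_wait`.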