Implement token bucket, sliding window log, Redis-based atomic rate limiting, standard rate limit headers, and distributed coordination strategies.
0 / 5 completed
1 / 5
What is the token bucket algorithm for rate limiting?
Token bucket: with a capacity of 100 and refill rate of 10/second, a burst of 100 requests can be served immediately (consuming all tokens), then only 10 more per second are allowed. This allows short bursts without rejecting requests as long as the long-term average stays within the refill rate.
2 / 5
What is the sliding window log rate limiting algorithm and its trade-off versus fixed windows?
Sliding window log: fixed windows (count resets every minute) allow a burst at the boundary: 100 requests at 0:59 and 100 more at 1:00 — 200 in two seconds. The sliding window prevents this by looking at actual timestamps. The trade-off is O(n) memory per user, which becomes expensive at millions of requests per minute.
3 / 5
How does Redis-based distributed rate limiting handle concurrent requests atomically?
Redis atomicity: if two nodes read the same counter simultaneously, both see "99 requests" and both allow the 100th, exceeding the limit. A Lua script bundles the read-increment-check in one atomic operation, so only one node's request succeeds. The Sliding Window Counter algorithm uses ZADD + ZCARD in a Lua script for per-user tracking.
4 / 5
What information do rate limit response headers convey and what is the standard format?
Rate limit headers: a well-behaved API client reads RateLimit-Remaining: 5 and slows down before hitting zero, avoiding 429 errors entirely. Retry-After in the 429 response tells the client how long to wait. The IETF RateLimit Headers specification (RFC 6585 extension) is being adopted by major APIs like GitHub and Stripe.
5 / 5
What challenge does distributed rate limiting face that single-node rate limiting does not?
Distributed coordination: solutions include centralised counters (Redis), eventually-consistent local counters with gossip sync, or sticky load balancing (same user always hits same node). Each trades accuracy for latency — centralised Redis adds a network round-trip to every request; local counters allow temporary limit overruns in exchange for no coordination overhead.