Learn the vocabulary of many clients hitting a shared resource at the exact same moment.
0 / 5 completed
1 / 5
At standup, a dev mentions that the moment a single popular cache entry expires, thousands of waiting requests all wake up at the exact same instant and hit the origin database simultaneously, overwhelming it far beyond its normal load. What is this problem called?
A thundering herd is exactly this: it describes a large number of clients or processes all waking up and hitting a shared resource at exactly the same moment, such as when a popular cache entry expires or a lock is released, overwhelming that resource far beyond its normal load. A hash collision is an unrelated hash-table concept about two keys sharing a bucket. This simultaneous-mass-wakeup-overwhelms-a-resource pattern is exactly why a thundering herd can take down a backend that would otherwise handle the same total traffic just fine if it arrived spread out over time.
2 / 5
During a design review, the team adds jittered expiration times and a single-flight lock around cache refills, specifically because staggering when entries expire and letting only one request refill a given key at a time prevents thousands of requests from hitting the origin database simultaneously. Which capability does this provide?
Jittered expiration and a single-flight refill lock here provide protection against a simultaneous mass wakeup overwhelming the origin, since jittered expiration and a single-flight refill lock stagger requests and let only one refill the origin at a time, instead of every waiting request hitting it the instant a cache entry expires. Letting every cache entry expire at the same fixed time with no refill lock means every one of those waiting requests hits the origin simultaneously the moment the entry expires. This stagger-and-single-flight behavior is exactly why jittering and refill locks are the standard defense against a thundering herd.
3 / 5
In a code review, a dev notices a popular cache entry is set to expire at a fixed, identical time with no jitter, and nothing prevents every one of the thousands of requests waiting on that key from independently querying the origin database the instant it expires. What does this represent?
This is a missed opportunity to prevent a thundering herd, since jittering the expiration time and adding a single-flight refill lock would stagger requests instead of letting thousands hit the origin simultaneously the instant the entry expires. A cache eviction policy is an unrelated concept about discarded cache entries. This fixed-expiration-no-refill-lock pattern is exactly the kind of risk a reviewer flags once a cache entry is popular enough to have many requests waiting on it.
4 / 5
An incident report shows the origin database was overwhelmed and became unresponsive for several minutes, because a popular cache entry expired at a fixed time with no jitter, and thousands of waiting requests independently queried the origin at the exact same instant with no refill lock to coordinate them. What practice would prevent this?
Adding jittered expiration times and a single-flight refill lock staggers requests and lets only one refill the origin at a time instead of thousands hitting it simultaneously the moment the entry expires. Continuing to expire the cache entry at a fixed time with no jitter and no refill lock regardless of how often the origin gets overwhelmed by the simultaneous requests is exactly what caused the outage described in this incident. This jitter-and-single-flight approach is the standard fix once a thundering herd is confirmed to overwhelm the origin.
5 / 5
During a PR review, a teammate asks why the team adds jitter and a single-flight refill lock instead of simply giving the origin database much larger capacity so it can absorb the simultaneous spike whenever a popular cache entry expires. What is the reasoning?
Jitter and a single-flight lock prevent the simultaneous spike from ever happening in the first place at a fraction of the cost, while over-provisioning the origin's capacity to absorb a full simultaneous spike means paying for that extra capacity around the clock even though the spike itself only happens in the brief moment after a popular entry expires. This is exactly why jittering and single-flight locking are the standard defense against a thundering herd, rather than simply over-provisioning capacity to absorb it.