Practice the vocabulary of distributing keys across nodes with minimal remapping on scale changes.
0 / 5 completed
1 / 5
At standup, a dev mentions a hashing scheme for distributing data across cache or database nodes, designed so adding or removing a node only remaps a small fraction of keys instead of nearly all of them. What is this hashing scheme called?
Consistent hashing distributes data across cache or database nodes using a scheme designed so adding or removing a node only remaps a small fraction of keys, rather than the near-total remapping a simple modulo-based hash would cause whenever the node count changes. A simple modulo hash, like key mod number-of-nodes, reassigns almost every key to a different node the moment that node count changes at all. Consistent hashing avoids that costly mass remapping and the resulting cache misses or data movement it would otherwise trigger.
2 / 5
During a design review, the team wants each physical node to be represented by multiple points on the hash ring, so load distributes more evenly even when the number of nodes is small. Which capability supports this?
Virtual nodes on the consistent hash ring represent each physical node with several points scattered around the ring rather than just one, spreading out that node's share of the key space more evenly and reducing the chance of an unevenly loaded physical node when the total node count is small. Representing each physical node with exactly one fixed point risks a lopsided distribution, since a single point's position on the ring is essentially random relative to the others. This virtual-node technique meaningfully improves consistent hashing's load-balancing behavior in practice.
3 / 5
In a code review, a dev notices the system reads a small number of consecutive nodes clockwise from a key's hash position on the ring, so a lookup can still succeed even if the immediately closest node is currently down. What does this represent?
Ring-based replication to the next N nodes clockwise from a key's hash position gives a lookup somewhere to fall back to if the single closest node happens to be down, rather than that lookup failing outright. Reading only from the single closest node with no fallback makes the whole system fragile to any individual node failure. This replication-to-the-next-few-nodes pattern is a natural extension of consistent hashing that adds real fault tolerance on top of its load-distribution benefit.
4 / 5
An incident report shows a cache cluster using a simple modulo-based hash suffered a massive wave of cache misses after a single node was added, since nearly every key's assigned node changed at once. What practice would prevent this?
Using consistent hashing ensures adding or removing a node only remaps a small fraction of keys, avoiding the massive cache-miss wave a simple modulo-based hash causes when nearly every key's assigned node changes at once. Continuing to use a modulo-based hash and accepting that mass remapping as unavoidable ignores that consistent hashing was specifically designed to solve exactly this problem. This is precisely why consistent hashing became the standard approach for a distributed cache or database that needs to scale its node count elastically.
5 / 5
During a PR review, a teammate asks why the team uses consistent hashing instead of a simple modulo-based hash to distribute keys across cache nodes. What is the reasoning?
A simple modulo-based hash remaps nearly every key to a different node the moment the node count changes at all, causing a large wave of cache misses or unnecessary data movement. Consistent hashing only remaps a small fraction of keys in that same scenario, making it far better suited to a cache or database cluster that scales its node count up or down over time. The tradeoff is the added implementation complexity of the hash ring and virtual-node structure compared to a much simpler modulo calculation.