Build fluency in the vocabulary of fixing a stale replica the moment a multi-replica read exposes the discrepancy.
1 / 5
At standup, a dev mentions a distributed read that queries multiple replicas, notices one of them returned stale data, and pushes the newer value back to that stale replica right then, as part of handling the read itself. What is this technique called?
Read repair is exactly this: when a read queries multiple replicas and one of them returns older data than the others, the newer value is pushed back to that stale replica right then, fixing the discrepancy as a side effect of serving the read. A hash collision is an unrelated hash-table concept in which two keys hash to the same bucket. Because the fix happens during the read, read repair helps a distributed data store converge toward consistency without a dedicated background process having to find every discrepancy.
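The behavior described above can be sketched in a few lines. This is a minimal illustration, not any particular store's implementation: the `Replica` class, the integer version numbers, and `read_with_repair` are all hypothetical names, and real systems usually compare timestamps or vector clocks rather than a simple counter.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica mapping key -> (version, value)."""
    name: str
    data: dict = field(default_factory=dict)

    def get(self, key):
        # A missing key is reported as version 0 with no value.
        return self.data.get(key, (0, None))

    def put(self, key, version, value):
        # Accept a write only if it is newer than what is already stored.
        if version > self.get(key)[0]:
            self.data[key] = (version, value)


def read_with_repair(replicas, key):
    """Query every replica, return the newest value, and repair stale replicas."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    newest_version, newest_value = max(
        (resp for _, resp in responses), key=lambda resp: resp[0]
    )
    for replica, (version, _) in responses:
        if version < newest_version:
            # The repair step: push the newer value back during the read.
            replica.put(key, newest_version, newest_value)
    return newest_value


a, b, c = Replica("a"), Replica("b"), Replica("c")
for replica in (a, b, c):
    replica.put("user:1", 1, "old")
a.put("user:1", 2, "new")
b.put("user:1", 2, "new")  # replica c missed this write

result = read_with_repair([a, b, c], "user:1")
```

After the call, the caller has the newest value and replica `c` holds it too, even though no separate repair job ran.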
2 / 5
During a design review, the team relies on read repair specifically so a replica that missed an earlier write gets corrected the very next time any client happens to read through it. Which capability does this provide?
Read repair here provides opportunistic consistency healing driven by normal read traffic: any read that touches a stale replica also repairs it on the spot, so frequently read data tends to self-heal quickly instead of waiting for a background process to notice that specific discrepancy. Leaving the stale replica uncorrected until a separate background process gets around to it would delay convergence for exactly the popular, frequently accessed data where read repair's immediate fix matters most. Read repair only heals what reads happen to touch, though, which is why it complements, rather than replaces, a periodic background anti-entropy process.
3 / 5
In a code review, a dev notices a distributed read path queries multiple replicas, detects that one of them returned an older value than the others, and simply returns the newest value to the caller without pushing that newer value back to the stale replica. What does this represent?
This is a missed read-repair opportunity. The read path has already done the hard part, detecting that one replica's value is stale compared to the others; discarding that finding instead of pushing the newer value back leaves the discrepancy in place for the next read to rediscover. A cache eviction policy is an unrelated concept about which cache entries get discarded. This detect-but-don't-fix pattern is exactly what a reviewer should flag, since the newer value is already sitting right there in the read path.
4 / 5
An incident report shows a distributed data store kept re-detecting the same stale replica on every read for a popular key over many hours, because the read path noticed the discrepancy each time but never pushed the corrected value back to that replica. What practice would prevent this?
Adding read repair, so the newer value is pushed back to the stale replica the moment a read detects it, stops the same discrepancy from being rediscovered over and over, which is exactly the fix for the repeated re-detection in this incident. Continuing to detect the stale replica on every read without ever correcting it is what let the same problem repeat for hours. Detect-and-fix-in-place is the standard, low-overhead way to heal a stale replica as soon as normal read traffic exposes it.
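To make the incident concrete, here is a small simulation, assuming replicas are plain dictionaries mapping a key to a `(version, value)` pair. The `read` and `count_detections` functions and the `repair` flag are hypothetical, introduced only to contrast the two read paths.

```python
def read(replicas, key, repair):
    """Return (newest_value, stale_count) for one read across all replicas."""
    responses = [replica.get(key, (0, None)) for replica in replicas]
    newest = max(responses, key=lambda resp: resp[0])
    stale = [
        replica
        for replica, resp in zip(replicas, responses)
        if resp[0] < newest[0]
    ]
    if repair:
        for replica in stale:
            # Push the newer (version, value) pair back to the stale replica.
            replica[key] = newest
    return newest[1], len(stale)


def count_detections(repair, reads=1000):
    """Count how many times a stale replica is detected over many reads."""
    replicas = [{"k": (2, "new")}, {"k": (2, "new")}, {"k": (1, "old")}]
    return sum(read(replicas, "k", repair)[1] for _ in range(reads))


without_repair = count_detections(repair=False)
with_repair = count_detections(repair=True)
```

Without repair, every one of the reads detects the same stale replica again; with repair, only the first read does, because that read also fixes it.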
5 / 5
During a PR review, a teammate asks why the team relies on read repair in addition to a periodic anti-entropy background process, given that anti-entropy already compares and repairs replicas on its own schedule. What is the reasoning?
Read repair fixes a stale replica the instant normal read traffic exposes the discrepancy, which can heal popular, frequently accessed data well before anti-entropy's next scheduled comparison pass reaches it. Anti-entropy, in turn, provides the safety net: it eventually catches and repairs data that is rarely or never read. The tradeoff is that read repair only catches discrepancies that reads happen to expose, leaving rarely accessed data dependent on anti-entropy alone. This is why the two techniques are commonly used together, combining read repair's fast, traffic-driven healing with anti-entropy's thorough, scheduled coverage.
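The division of labor can be sketched as follows, again assuming replicas are plain dictionaries of `key -> (version, value)`. Both function names are hypothetical, and the anti-entropy pass here naively compares every key; production systems typically compare hashes of key ranges (often with Merkle trees) to avoid shipping all the data.

```python
def read_with_repair(replicas, key):
    """Serve a read and push the newest value to any stale replica."""
    responses = [replica.get(key, (0, None)) for replica in replicas]
    newest = max(responses, key=lambda resp: resp[0])
    for replica, resp in zip(replicas, responses):
        if resp[0] < newest[0]:
            replica[key] = newest
    return newest[1]


def anti_entropy_pass(replicas):
    """Compare every key on every replica and copy the newest version everywhere."""
    all_keys = set().union(*(replica.keys() for replica in replicas))
    for key in all_keys:
        newest = max(
            (replica.get(key, (0, None)) for replica in replicas),
            key=lambda resp: resp[0],
        )
        for replica in replicas:
            if replica.get(key, (0, None))[0] < newest[0]:
                replica[key] = newest


fresh = {"hot": (2, "new"), "cold": (2, "new")}
stale = {"hot": (1, "old"), "cold": (1, "old")}
replicas = [fresh, stale]

read_with_repair(replicas, "hot")   # normal traffic touches only the hot key
hot_after_read = stale["hot"]
cold_after_read = stale["cold"]     # never read, so still stale

anti_entropy_pass(replicas)         # the scheduled pass covers everything
cold_after_pass = stale["cold"]
```

The hot key is healed by the read alone, while the cold key stays stale until the scheduled pass runs, which is the gap anti-entropy is there to close.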