Learn the vocabulary for a background process that compares replicas and repairs any that have drifted out of sync.
1 / 5
At standup, a dev mentions a background process that periodically compares replicas of the same data against each other and repairs any differences it finds, so replicas that drifted out of sync during a network issue gradually converge back to consistency. What is this process called?
Anti-entropy is a background process that periodically compares replicas of the same data and repairs any differences it finds, so replicas that drifted apart during a network partition or a dropped write converge back to consistency over time. A hash collision is an unrelated hashing concept: two distinct keys producing the same hash value. Periodic compare-and-repair is why a distributed data store can recover from transient inconsistency without requiring every write to reach every replica in real time.
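The compare-and-repair cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular data store implements it: the `Replica` type, the `anti_entropy_round` function, and the "highest version number wins" rule are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical replica layout: key -> (version, value). Higher version wins.
Replica = Dict[str, Tuple[int, str]]


def anti_entropy_round(a: Replica, b: Replica) -> int:
    """Compare two replicas and copy the newer entry for each differing key.

    Returns the number of keys that were repaired.
    """
    repaired = 0
    for key in set(a) | set(b):
        entry_a = a.get(key)
        entry_b = b.get(key)
        if entry_a == entry_b:
            continue  # already consistent for this key
        # A missing entry is treated as older than any real version.
        version_a = entry_a[0] if entry_a is not None else -1
        version_b = entry_b[0] if entry_b is not None else -1
        if version_a > version_b:
            b[key] = entry_a
        else:
            # Ties are broken arbitrarily in favor of b (an assumption of this sketch).
            a[key] = entry_b
        repaired += 1
    return repaired


# Replica b missed one update and one insert during a partition.
replica_a: Replica = {"user:1": (2, "alice@new.example"), "user:2": (1, "bob@example.com")}
replica_b: Replica = {"user:1": (1, "alice@old.example")}

repaired_count = anti_entropy_round(replica_a, replica_b)
second_pass = anti_entropy_round(replica_a, replica_b)
```

After the first round both replicas hold the same entries, and a second round finds nothing left to repair. Real systems usually avoid comparing every key directly, but the converge-by-repair idea is the same.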
2 / 5
During a design review, the team relies on anti-entropy specifically so replicas that missed some writes during a network partition eventually catch up and converge, even without a dedicated real-time repair for that particular missed write. Which capability does this provide?
Anti-entropy provides eventual convergence of replicas after a transient inconsistency. Its periodic compare-and-repair cycle catches differences regardless of their cause, including a write missed during a network partition, so no dedicated real-time mechanism is needed for that specific write. Applying every write synchronously to every replica would avoid the inconsistency in the first place, but at a much higher cost to availability and latency. That is why anti-entropy is a standard complement to a distributed store's write path.
3 / 5
In a code review, a dev notices a distributed data store has no background process at all for comparing and repairing replicas that may have drifted out of sync during a network partition, relying entirely on the write path to keep everything consistent. What does this represent?
This is a missed anti-entropy opportunity. Relying entirely on the write path leaves no safety net for a write that is missed during a network partition or other transient failure, whereas a periodic background compare-and-repair process would catch and fix that drift over time. A cache eviction policy is an unrelated concept: it decides which entries a cache discards. A reviewer flags this gap once network partitions and dropped writes are recognized as a realistic operational risk.
4 / 5
An incident report shows two replicas of the same data silently diverged for days after a network partition, because the store relied entirely on its write path for consistency with no background process comparing and repairing replicas afterward. What practice would prevent this?
Running a periodic anti-entropy process that compares replicas and repairs any differences it finds would catch the drift left by the network partition and fix it over time. Continuing to rely entirely on the write path, with no background comparison or repair, is what let the replicas silently diverge for days. Periodic compare-and-repair is the standard way to heal replica drift after a partition or other transient failure in a distributed data store.
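The detection half of such a process can be sketched as follows. This is only an illustration: the function names are hypothetical, and comparing a SHA-256 digest before comparing keys is one common shortcut assumed here, not something the incident report specifies. A scheduler would call `find_drift` at a fixed interval and hand the result to a repair step.

```python
import hashlib
from typing import Dict, List


def replica_digest(data: Dict[str, str]) -> str:
    """Hash the replica's contents in sorted key order, so equal contents give equal digests."""
    h = hashlib.sha256()
    for key in sorted(data):
        h.update(repr((key, data[key])).encode("utf-8"))
    return h.hexdigest()


def find_drift(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return the keys whose values differ between two replicas (hypothetical helper)."""
    # Cheap check first: matching digests mean there is nothing to compare in detail.
    if replica_digest(a) == replica_digest(b):
        return []
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


# Replica b missed the update to order:1 during the partition.
replica_a = {"order:1": "shipped", "order:2": "paid"}
replica_b = {"order:1": "pending", "order:2": "paid"}

drift = find_drift(replica_a, replica_b)
no_drift = find_drift(replica_a, dict(replica_a))
```

Here `drift` lists only `order:1`, the key that diverged, so the divergence is surfaced on the next scheduled run instead of going unnoticed for days.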
5 / 5
During a PR review, a teammate asks why the team runs a periodic anti-entropy process instead of just making every write synchronously update every replica before acknowledging success, which would avoid drift entirely. What is the reasoning?
Synchronously updating every replica on every write sacrifices availability and latency the moment any one replica is slow or temporarily unreachable, because the write cannot be acknowledged until all replicas confirm it. A store that relies on anti-entropy can instead acknowledge writes against whichever replicas are reachable and heal any resulting drift later in the background. The tradeoff is a window of temporary inconsistency between replicas before repair catches up. This is why anti-entropy is favored in systems that prioritize availability and accept eventual rather than immediate consistency across replicas.
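The availability tradeoff can be made concrete with a small Python sketch. Everything here is hypothetical and simplified for illustration: the `Replica` class, the `reachable` flag standing in for a network partition, and the two write functions are assumptions, not a real store's API.

```python
from typing import Dict, List


class Replica:
    """Toy replica; `reachable=False` stands in for a partitioned or slow node."""

    def __init__(self, name: str, reachable: bool = True) -> None:
        self.name = name
        self.reachable = reachable
        self.data: Dict[str, str] = {}

    def apply(self, key: str, value: str) -> bool:
        if not self.reachable:
            return False
        self.data[key] = value
        return True


def write_synchronous(replicas: List[Replica], key: str, value: str) -> bool:
    """Acknowledge only if every replica can apply the write; otherwise reject it."""
    if not all(r.reachable for r in replicas):
        return False
    for r in replicas:
        r.apply(key, value)
    return True


def write_best_effort(replicas: List[Replica], key: str, value: str) -> bool:
    """Acknowledge if any replica applied the write; unreachable replicas drift."""
    applied = [r.apply(key, value) for r in replicas]
    return any(applied)


replicas = [Replica("a"), Replica("b"), Replica("c", reachable=False)]

sync_ok = write_synchronous(replicas, "k", "v1")   # rejected: replica c is unreachable
best_ok = write_best_effort(replicas, "k", "v2")   # accepted: a and b applied it
```

With one replica unreachable, the synchronous write is rejected while the best-effort write succeeds and leaves replica `c` behind. That leftover gap is the drift a later anti-entropy pass would repair.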