Build fluency in the vocabulary of determining causal order between events in a distributed system.
1 / 5
At standup, a dev mentions a data structure made of per-node counters attached to an event, used to determine whether one event causally happened before another or whether the two are actually concurrent. What is this structure called?
A vector clock is a data structure made of per-node counters attached to an event, used to determine whether one event causally happened before another or whether the two are concurrent. Each node increments its own counter when it records an event and merges in the counters it receives from other nodes. A single wall-clock timestamp recorded by whichever node processed the event can't reliably capture causality, because independent nodes' clocks are imperfectly synchronized. The per-node counters let a distributed system reason about causal order directly rather than relying on an unsynchronized clock.
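As a rough illustration of the per-node counters described above, here is a minimal sketch that represents a vector clock as a plain dictionary. The names tick and merge and the node IDs "A" and "B" are assumptions made for this example, not part of any particular library.

```python
from typing import Dict

# A vector clock as a mapping from node ID to that node's event counter.
VectorClock = Dict[str, int]


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with `node`'s own counter incremented."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, node: str) -> VectorClock:
    """Take the per-node maximum of both clocks, then count the receive event."""
    nodes = local.keys() | received.keys()
    merged = {n: max(local.get(n, 0), received.get(n, 0)) for n in nodes}
    return tick(merged, node)


# Node A records a local event, then node B receives a message carrying A's clock.
a1 = tick({}, "A")
b1 = merge({}, a1, "B")
```

In this sketch a1 ends up with one entry for A, and b1 carries both A's counter and B's own, which is how B's event records that it has seen A's.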
2 / 5
During a design review, the team wants to compare two events' vector clocks to determine whether they represent a genuine conflict, meaning neither happened before the other, or a clear causal order. Which capability supports this?
Vector clock comparison determines whether two events have a clear causal order or are in genuine conflict by comparing their per-node counters. One event happened before another if none of its counters is greater than the other's and at least one is smaller; if each clock has some counter greater than the other's, neither happened before the other and the events are concurrent. Comparing wall-clock timestamps can't make this distinction, since two independent nodes' clocks are never perfectly synchronized. This comparison is what makes a vector clock useful for a system that needs to detect and resolve concurrent updates.
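The comparison can be sketched as follows, assuming clocks are dictionaries from node ID to counter and that a missing entry counts as zero. The function name compare and its string return values are choices made for this example.

```python
from typing import Dict


def compare(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Classify the causal relationship between two vector clocks."""
    nodes = a.keys() | b.keys()
    # a <= b holds when no counter in a exceeds the matching counter in b.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally happened before b
    if b_le_a:
        return "after"       # b causally happened before a
    return "concurrent"      # neither dominates: a genuine conflict


ordered = compare({"A": 1}, {"A": 1, "B": 1})
conflict = compare({"A": 2}, {"A": 1, "B": 1})
```

Here ordered is "before", because the second clock has seen everything the first has, while conflict is "concurrent", because each clock holds a counter the other lacks.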
3 / 5
In a code review, a dev notices the vector clock's size grows with the number of nodes that have ever participated in the system, requiring a pruning or garbage-collection step for a long-lived deployment. What does this represent?
Vector clock growth is the reason pruning is needed: the structure holds one counter per node, so its size grows with the number of nodes that have ever participated in the system. Assuming it stays small and fixed ignores that a long-lived deployment with nodes joining and leaving accumulates ever more entries. Pruning or garbage-collecting stale entries keeps the clock manageable, and it is a practical operational concern for any system relying on vector clocks long-term.
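One possible pruning step, sketched under the assumption that the system keeps a set of currently active node IDs, is to drop the entries of nodes that have left. The function prune_retired and the active_nodes set are hypothetical, and real systems may prune by entry age or clock size instead. Dropping entries trades precision for size: a pruned clock compared against an unpruned one may report concurrency where a causal order existed.

```python
from typing import Dict, Set


def prune_retired(clock: Dict[str, int], active_nodes: Set[str]) -> Dict[str, int]:
    """Keep only the counters that belong to nodes still in the cluster."""
    return {node: count for node, count in clock.items() if node in active_nodes}


# Node B has been decommissioned, so its entry is removed.
clock = {"A": 3, "B": 1, "C": 7}
pruned = prune_retired(clock, active_nodes={"A", "C"})
```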
4 / 5
An incident report shows two concurrent writes to the same record went undetected as conflicting because the system compared plain wall-clock timestamps instead of a vector clock, and the wrong version was silently kept as the latest. What practice would prevent this?
Using a vector clock to compare events correctly detects a genuine concurrent conflict, since it captures true causality directly rather than depending on an unsynchronized wall-clock timestamp. Comparing only timestamps risks exactly the silent, wrong resolution this incident describes, since two independent nodes' clocks can disagree about which event actually happened first. This causally aware comparison is essential wherever two nodes might genuinely write to the same data concurrently.
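To make the incident concrete, the sketch below contrasts a last-write-wins resolver based on wall-clock time with a causally aware one. The Write type, the resolver names, and the timestamps are invented for illustration, and returning both versions as siblings is just one way a system might surface a conflict.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Write:
    value: str
    wall_time: float          # timestamp from the writing node's own clock
    clock: Dict[str, int]     # vector clock attached to the write


def descends(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def resolve_lww(x: Write, y: Write) -> Write:
    """Last-write-wins on wall-clock time: silently keeps one version."""
    return x if x.wall_time >= y.wall_time else y


def resolve_causal(x: Write, y: Write) -> List[Write]:
    """Keep the causally newer write, or both if they are concurrent."""
    if descends(x.clock, y.clock):
        return [x]
    if descends(y.clock, x.clock):
        return [y]
    return [x, y]  # concurrent: surface both for conflict resolution


# Both nodes start from clock {"A": 1} and write without seeing each other.
w1 = Write("from A", wall_time=100.0, clock={"A": 2})
w2 = Write("from B", wall_time=99.5, clock={"A": 1, "B": 1})

kept_by_lww = resolve_lww(w1, w2)
kept_by_causal = resolve_causal(w1, w2)
```

In this scenario resolve_lww keeps only w1 and discards w2 without any signal, while resolve_causal returns both writes because neither clock descends from the other.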
5 / 5
During a PR review, a teammate asks why the team uses a vector clock instead of a simple wall-clock timestamp to order events across a distributed system. What is the reasoning?
Wall-clock timestamps across independent nodes aren't perfectly synchronized and can't reliably distinguish a genuine causal order from a truly concurrent conflict between two events. A vector clock captures true causality directly through its per-node counters, without depending on clock synchronization at all. The tradeoff is the vector clock's own growing size as more nodes participate, requiring an eventual pruning step to stay manageable.