Distributed Systems Consensus
Frequently Asked Questions
What does the CAP theorem state and how do engineers use it in design discussions?
The CAP theorem states that a distributed system can guarantee at most two of three properties simultaneously: Consistency (every read receives the most recent write), Availability (every request receives a response), and Partition Tolerance (the system continues operating despite network partitions). Engineers use it to frame trade-off decisions, saying a system is "CP" (consistent under partitions but may become unavailable) or "AP" (always available but may serve stale data).
How do engineers explain the Raft consensus algorithm in technical discussions?
Raft is a consensus algorithm designed to be more understandable than Paxos, using leader election, log replication, and term numbers to achieve agreement across a cluster. In discussions, engineers describe Raft by saying "a leader is elected each term by a majority vote, and all writes go through the leader, which replicates them to a quorum before committing". Systems like etcd and CockroachDB use Raft internally.
What is "split-brain" in distributed systems and how is it communicated?
Split-brain occurs when a network partition causes two parts of a cluster to each believe they are the active primary, potentially leading to conflicting writes and data divergence. Engineers prevent it using quorum-based fencing — requiring a majority (quorum) of nodes to agree before any node can act as leader — and describe split-brain resolution strategies using terms like "STONITH" (Shoot The Other Node In The Head) for forcibly evicting a node.
What is linearisability and how does it differ from eventual consistency?
Linearisability (also called strong consistency) guarantees that operations appear to take effect instantaneously at a single point in time, making a distributed system behave as if it were a single machine. Eventual consistency only guarantees that, absent new writes, all replicas will converge — allowing reads to return stale values in the interim. Engineers describe linearisability as "the strongest consistency model" and note it comes at a latency and availability cost.
How do engineers discuss distributed locks and their failure modes?
A distributed lock coordinates exclusive access to a shared resource across multiple processes or nodes. Engineers implement them using systems like Redis (Redlock algorithm) or ZooKeeper and discuss failure modes such as "lock expiry during a slow process leading to two holders simultaneously" or "a network partition causing a holder to be unable to renew its lease". The vocabulary includes "fencing tokens" — monotonically increasing numbers used to detect and reject stale lock holders.
What is two-phase commit (2PC) and what are its limitations?
Two-phase commit is a distributed transaction protocol where a coordinator first asks all participants to prepare (vote to commit or abort), then instructs all to commit or abort based on the votes. Engineers describe its key limitation as the "blocking problem" — if the coordinator fails after the prepare phase but before sending the commit decision, participants are left holding locks indefinitely, unable to progress without manual intervention or a timeout.
What does "quorum" mean in distributed consensus and how is it calculated?
A quorum is the minimum number of nodes that must agree for a distributed operation to succeed, typically a simple majority (floor(n/2) + 1 nodes in a cluster of n). Engineers use it to ensure that any two quorums overlap by at least one node, guaranteeing that a committed decision is visible to any subsequent quorum read. They describe systems as "requiring quorum writes" to prevent data loss during node failures.
How do engineers explain Lamport timestamps and vector clocks?
Lamport timestamps assign a logical counter to each event, incremented on send and set to max(local, received)+1 on receipt, providing a partial ordering of events across nodes. Vector clocks extend this to a vector of counters (one per node), enabling detection of concurrent events and causal relationships. Engineers say two events are "causally related" if one vector clock dominates the other, and "concurrent" if neither dominates.
What is the PACELC theorem and how does it extend CAP?
PACELC extends the CAP theorem to address the latency/consistency trade-off that exists even when there is no partition. It states that during a Partition (P), a system chooses between Availability and Consistency (AC); Else (E), even without a partition, a system must choose between Latency and Consistency (LC). Engineers use PACELC to have more nuanced discussions about system behaviour under normal operating conditions, not just failure scenarios.
How do engineers discuss consistency levels in distributed databases like Cassandra?
Distributed databases like Apache Cassandra allow per-query consistency levels that trade latency against staleness guarantees. Engineers use terms like QUORUM (majority of replicas must respond), ONE (any single replica), and ALL (all replicas must respond) to describe the read/write consistency level. A common pattern is "write at QUORUM, read at QUORUM" to ensure that at least one replica in every read/write pair overlaps, achieving strong consistency without requiring ALL.