Consensus Algorithms
5 exercises — master consensus algorithm vocabulary: Raft vs Paxos, leader election, log replication, quorum calculation, log compaction via snapshots, and the pre-vote extension.
0 / 5 completed
Consensus algorithm quick reference
- Raft — consensus via leader election + log replication + safety. Designed for understandability.
- Multi-Paxos — equivalent guarantees, but less prescriptive implementation. Harder to implement correctly.
- Term — Raft's monotonic epoch; a new term begins with each election attempt.
- Quorum = ⌊n/2⌋ + 1. For 5 nodes: quorum = 3, fault tolerance = 2.
- Commit — a log entry is committed when replicated to a quorum of nodes.
- Log compaction — snapshotting state machine; discards already-applied log entries.
- Pre-vote — partitioned nodes check election viability before incrementing the term (prevents term inflation).
1 / 5
A senior engineer says: "We're using Raft for leader election and log replication in our key-value store. Why Raft specifically, not Paxos?"
What's the key difference?
Raft = same guarantees as Paxos, but explicitly designed to be understandable and straightforward to implement correctly.
What consensus algorithms solve:
A set of nodes must agree on a single value (or a sequence of values — a log) even if some nodes crash. They must satisfy:
• Safety — all nodes that decide, decide the same value
• Liveness — given enough time and message delivery, a decision is eventually reached
• Both Raft and Paxos tolerate up to ⌊(n-1)/2⌋ crash-fail faults in a cluster of n nodes
Raft's three sub-problems:
Raft key concepts:
• Term — monotonically increasing epoch number; each election starts a new term
• Log index + term — uniquely identifies a log entry; used to detect log inconsistencies
• Commit index — the highest log entry acknowledged by a quorum; safe to apply to the state machine
• Leader completeness property — a leader always has all committed entries; followers' logs can lag
Why Paxos is harder:
• Multi-Paxos (for log replication) requires many implementation choices not specified by the paper
• Leader "distinguished proposer" election is underspecified
• Handling log gaps and leadership recovery is left as an exercise
• Real-world Paxos: Chubby (Google), Zookeeper ZAB (Paxos-like), Spanner
Real-world Raft: etcd (Kubernetes), CockroachDB, TiKV, Consul
Key vocabulary:
• Raft — consensus protocol decomposed into leader election, log replication, and safety; designed for understandability
• Multi-Paxos — extension of Paxos for replicated logs; correct but leaves many design choices unspecified
• Term — Raft's epoch number; a new term begins with each election attempt
• Quorum — majority of nodes (⌊n/2⌋ + 1); required for election and commit
What consensus algorithms solve:
A set of nodes must agree on a single value (or a sequence of values — a log) even if some nodes crash. They must satisfy:
• Safety — all nodes that decide, decide the same value
• Liveness — given enough time and message delivery, a decision is eventually reached
• Both Raft and Paxos tolerate up to ⌊(n-1)/2⌋ crash-fail faults in a cluster of n nodes
Raft's three sub-problems:
| Sub-problem | Mechanism |
|---|---|
| Leader election | Randomised election timeouts → candidates request votes from quorum → term-based leadership |
| Log replication | Leader accepts writes → appends to log → replicates to followers → commits when majority acknowledge |
| Safety | Election restriction: candidates must have up-to-date log; only one leader per term |
Raft key concepts:
• Term — monotonically increasing epoch number; each election starts a new term
• Log index + term — uniquely identifies a log entry; used to detect log inconsistencies
• Commit index — the highest log entry acknowledged by a quorum; safe to apply to the state machine
• Leader completeness property — a leader always has all committed entries; followers' logs can lag
Why Paxos is harder:
• Multi-Paxos (for log replication) requires many implementation choices not specified by the paper
• Leader "distinguished proposer" election is underspecified
• Handling log gaps and leadership recovery is left as an exercise
• Real-world Paxos: Chubby (Google), Zookeeper ZAB (Paxos-like), Spanner
Real-world Raft: etcd (Kubernetes), CockroachDB, TiKV, Consul
Key vocabulary:
• Raft — consensus protocol decomposed into leader election, log replication, and safety; designed for understandability
• Multi-Paxos — extension of Paxos for replicated logs; correct but leaves many design choices unspecified
• Term — Raft's epoch number; a new term begins with each election attempt
• Quorum — majority of nodes (⌊n/2⌋ + 1); required for election and commit