Build fluency in the vocabulary of a cluster agreeing on exactly one coordinator node.
1 / 5
At standup, a dev mentions a cluster of interchangeable nodes using a consensus protocol to agree on exactly one of themselves as the coordinator for a task, rather than a person designating one manually. What is this process called?
Leader election is the process by which a cluster of interchangeable nodes uses a consensus protocol to agree on exactly one of themselves as the coordinator for a task, so the rest of the cluster can defer to that one node without ambiguity. Load balancing, by contrast, distributes traffic across multiple nodes and does not require the cluster to agree on a single coordinator. Leader election is a foundational building block for any distributed system that needs exactly one node in charge of a specific responsibility at a time.
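The "agree on exactly one" property can be illustrated with a toy majority vote. The sketch below is loosely modeled on term-based voting of the kind Raft uses, but it is a single-process illustration only: Voter, request_vote, and run_election are hypothetical names, and it omits networking, log comparison, retries, and randomized timeouts that a real protocol needs.

```python
from typing import Dict, List, Optional


class Voter:
    """A node that grants at most one vote per term (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.voted_in: Dict[int, str] = {}  # term -> candidate this node voted for

    def request_vote(self, term: int, candidate: str) -> bool:
        # Record the first candidate seen in this term; later requests in the
        # same term succeed only if they come from that same candidate.
        chosen = self.voted_in.setdefault(term, candidate)
        return chosen == candidate


def run_election(term: int, candidate: str, nodes: List[Voter]) -> Optional[str]:
    """Return the candidate if it wins a strict majority in this term, else None."""
    votes = sum(1 for node in nodes if node.request_vote(term, candidate))
    return candidate if votes > len(nodes) // 2 else None


nodes = [Voter(name) for name in ("a", "b", "c")]
first = run_election(term=1, candidate="a", nodes=nodes)   # "a" wins term 1
second = run_election(term=1, candidate="b", nodes=nodes)  # None: votes already cast
```

Because each node votes at most once per term and a win requires a strict majority, two different candidates cannot both win the same term in this model.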
2 / 5
During a design review, the team wants an elected leader's authority to automatically expire and hand off to another node if the leader stops renewing it, instead of the leadership lasting forever once granted. Which capability supports this?
A time-bound lease that the leader must periodically renew ensures that leadership expires automatically and can pass to another node if the current leader stops renewing it, for instance because it crashed or became partitioned. Leadership granted permanently at election time leaves the cluster stuck if that leader ever becomes unreachable, because nothing revokes its authority. Lease-based renewal is what lets a cluster recover from a failed leader without requiring a person to intervene manually.
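A minimal in-memory sketch of the lease idea follows. The Lease class and its method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow; a real deployment would keep the lease in a shared store with atomic updates and would have to account for clock differences between nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """A single leadership lease (hypothetical, single-process model)."""
    ttl: float                      # seconds a grant or renewal stays valid
    holder: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, node: str, now: float) -> bool:
        # Grant the lease if it is free, expired, or already held by this node.
        if self.holder is None or now >= self.expires_at or self.holder == node:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False

    def renew(self, node: str, now: float) -> bool:
        # Only the current, unexpired holder may extend the lease.
        if self.holder == node and now < self.expires_at:
            self.expires_at = now + self.ttl
            return True
        return False

    def leader(self, now: float) -> Optional[str]:
        # Leadership lapses on its own once the lease expires.
        return self.holder if now < self.expires_at else None


lease = Lease(ttl=10.0)
lease.try_acquire("node-a", now=0.0)   # node-a becomes leader until t=10
lease.renew("node-a", now=8.0)         # extended until t=18
lease.try_acquire("node-b", now=18.0)  # node-a stopped renewing; node-b takes over
```

The handoff needs no explicit revocation step: node-a loses leadership simply by failing to renew before the deadline.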
3 / 5
In a code review, a dev notices followers tracking a leader's periodic heartbeat messages and triggering a new election once no heartbeat has arrived for longer than a configured timeout. What does this represent?
Heartbeat-based failure detection has followers track a leader's periodic heartbeat messages and trigger a new election once no heartbeat has arrived for longer than a configured timeout, giving the cluster a way to notice a failed leader without a person watching. A read replica is an unrelated concept about serving read traffic rather than detecting a leader's failure. This heartbeat mechanism is what makes leader election a genuinely automated, self-healing process rather than a one-time decision.
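The follower-side check can be sketched in a few lines. HeartbeatMonitor and its methods are hypothetical names, and timestamps are passed in explicitly instead of being read from a clock; the timeout value is an arbitrary example.

```python
class HeartbeatMonitor:
    """Follower-side view of the leader's liveness (hypothetical helper)."""

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout      # seconds of silence tolerated
        self.last_heartbeat = now   # start counting from when monitoring began

    def record_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat message from the leader arrives.
        self.last_heartbeat = now

    def should_start_election(self, now: float) -> bool:
        # The leader is suspected failed once the silence exceeds the timeout.
        return now - self.last_heartbeat > self.timeout


monitor = HeartbeatMonitor(timeout=3.0, now=0.0)
monitor.record_heartbeat(now=1.0)
still_alive = monitor.should_start_election(now=3.5)   # False: 2.5 s of silence
suspected = monitor.should_start_election(now=4.5)     # True: 3.5 s of silence
```

Choosing the timeout is a tradeoff: a short value detects failures quickly but risks triggering elections on ordinary network delay, while a long value is more tolerant but leaves the cluster leaderless for longer.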
4 / 5
An incident report shows two nodes both believed themselves to be the leader and both accepted writes simultaneously after a network partition, because the old leader kept operating with no fencing token proving its leadership had actually been revoked. What practice would prevent this?
Requiring every write to carry a fencing token, a number that increases monotonically with each new leadership grant, lets downstream systems reject a stale leader's writes once a newer leader has been elected and issued a higher token. This closes the split-brain window that a lease timeout alone can leave open. Continuing to trust a leader's writes with no fencing token is exactly what let two nodes both accept writes simultaneously in this incident. Fencing is a standard defense against the split-brain scenario a network partition can otherwise trigger.
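The downstream check can be sketched as follows. FencedStore is a hypothetical stand-in for whatever resource the leader writes to, and the token values are arbitrary; the sketch assumes some election service hands each new leader a token higher than any issued before.

```python
from typing import Dict


class FencedStore:
    """Downstream resource that rejects writes carrying a stale token."""

    def __init__(self) -> None:
        self.highest_token = 0             # highest fencing token seen so far
        self.data: Dict[str, str] = {}

    def write(self, token: int, key: str, value: str) -> bool:
        # A token lower than one already seen belongs to a deposed leader.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
store.write(token=33, key="config", value="from old leader")      # accepted
store.write(token=34, key="config", value="from new leader")      # accepted
rejected = not store.write(token=33, key="config", value="late")  # old leader is fenced off
```

The key design point is that the check lives in the resource being written to, so it still works when the old leader has not yet noticed that its leadership was revoked.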
5 / 5
During a PR review, a teammate asks why the team relies on a consensus-based leader election protocol instead of just designating one node as the leader in a config file and trusting it to stay up. What is the reasoning?
A statically designated leader in a config file has no automated way to fail over if that specific node goes down, leaving the cluster without a functioning coordinator until a person manually updates the config. A consensus-based election automatically detects the failure, through a missed heartbeat or an expired lease, and elects a replacement without requiring manual intervention. The tradeoff is the added complexity of running an actual consensus protocol instead of just hardcoding one node's identity.