English for CockroachDB Developers
Learn the English vocabulary for CockroachDB: ranges, leaseholders, distributed transactions, and the terms for discussing a distributed SQL database.
CockroachDB is SQL on top of a genuinely distributed storage layer, and the terms that describe how data is physically spread across nodes — ranges, leaseholders, replicas — matter a lot more day-to-day than they would with a single-node Postgres instance. This guide covers that vocabulary.
Key Vocabulary
Range — the unit CockroachDB splits table data into: a contiguous chunk of the key space, up to roughly 512 MiB by default. Each range is replicated and rebalanced across nodes independently, which makes it the fundamental building block of horizontal scaling. “As that table grew, it automatically split into more ranges — each range gets balanced across the cluster independently, so we didn’t need to manually shard anything.”
Leaseholder — the single replica of a range currently authorized to serve reads and coordinate writes for that range, which can move between nodes as the cluster rebalances. “Query latency spiked because the leaseholder for that range was on a node in a different region than most of our traffic — we pinned it closer with a zone configuration.”
Replica — a copy of a range stored on a specific node; by default CockroachDB keeps three replicas of every range, tolerating the loss of one node without data loss. “With three replicas per range, losing a single node doesn’t cause downtime — the cluster just elects a new leaseholder from the surviving replicas.”
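The "three replicas tolerate one lost node" rule follows from majority arithmetic. The sketch below is a minimal illustration, assuming a range stays available as long as a majority of its replicas survive (CockroachDB replicates ranges with Raft consensus). The function names are hypothetical and are not part of any CockroachDB API.

```python
def quorum_size(num_replicas: int) -> int:
    """Smallest majority of replicas a range needs to stay available."""
    if num_replicas < 1:
        raise ValueError("num_replicas must be at least 1")
    return num_replicas // 2 + 1


def tolerated_failures(num_replicas: int) -> int:
    """How many replicas can be lost while a majority still survives."""
    return num_replicas - quorum_size(num_replicas)


# Compare the default of three replicas with a five-replica policy.
for n in (3, 5):
    print(f"{n} replicas: quorum of {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

In a design review, this lets you say "with five replicas we can tolerate two simultaneous node failures" instead of the vaguer "it's more resilient."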
Distributed transaction — a transaction that touches ranges spread across multiple nodes, coordinated through a protocol similar to two-phase commit so that it keeps full ACID guarantees even though the data is physically distributed. “That transfer function touches accounts that could live on ranges across different nodes, but CockroachDB still guarantees it commits atomically as a distributed transaction.”
Zone configuration — the policy controlling how and where a table’s or database’s ranges are replicated and placed, used to pin data to specific regions for latency or compliance reasons. “We set a zone configuration to keep EU customer data’s ranges within EU-region nodes only, satisfying the data residency requirement without a separate database.”
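To make the term concrete, here is a hedged sketch of what a zone configuration statement can look like, built as a string in Python. The table name `eu_customers` and the region `eu-west-1` are hypothetical, and the sketch assumes the nodes were started with a matching `region` locality. The `CONFIGURE ZONE USING` clause with `num_replicas`, `constraints`, and `lease_preferences` is CockroachDB SQL, but check the documentation for your version before relying on it.

```python
def _check(value: str, extra: str) -> str:
    """Allow only alphanumerics plus a few extra characters (illustrative guard)."""
    if not value or not all(ch.isalnum() or ch in extra for ch in value):
        raise ValueError(f"unexpected value: {value!r}")
    return value


def configure_zone_sql(table: str, region: str, num_replicas: int = 3) -> str:
    """Build a statement that pins a table's replicas and leaseholder to one region."""
    table = _check(table, "_.")
    region = _check(region, "-_")
    if num_replicas < 1:
        raise ValueError("num_replicas must be at least 1")
    return (
        f"ALTER TABLE {table} CONFIGURE ZONE USING "
        f"num_replicas = {num_replicas}, "
        f"constraints = '[+region={region}]', "
        f"lease_preferences = '[[+region={region}]]';"
    )


# Hypothetical table and region names, for illustration only.
print(configure_zone_sql("eu_customers", "eu-west-1"))
```

The `constraints` setting is what you would point to in a compliance review (where replicas may live), while `lease_preferences` is the part to mention in a latency discussion (where the leaseholder should sit).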
Common Phrases
- “Is this latency coming from a leaseholder that’s far from where the query originates?”
- “How many replicas does this range have, and can we tolerate a node failure without downtime?”
- “Does this transaction span multiple ranges, and is that adding coordination overhead?”
- “Is there a zone configuration pinning this data to a specific region?”
- “Has this table split into multiple ranges yet, or is it still small enough to be one?”
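The question about a transaction spanning multiple ranges can be pictured with a small model. The sketch below is a simplification, assuming integer keys and a table whose ranges are defined by a sorted list of split points. It only illustrates the idea of keys landing in different ranges and does not query a real cluster. All names and the split points are hypothetical.

```python
from bisect import bisect_right


def range_index(key: int, split_keys: list[int]) -> int:
    """Index of the range holding `key`, given sorted split points.

    Range i covers keys from split_keys[i - 1] (inclusive)
    up to split_keys[i] (exclusive).
    """
    return bisect_right(split_keys, key)


def spans_multiple_ranges(keys: list[int], split_keys: list[int]) -> bool:
    """True if the keys a transaction touches fall in more than one range."""
    return len({range_index(k, split_keys) for k in keys}) > 1


# Hypothetical accounts table split at IDs 1000 and 2000 (three ranges).
splits = [1000, 2000]
print(spans_multiple_ranges([12, 845], splits))   # both keys sit in the first range
print(spans_multiple_ranges([12, 1500], splits))  # keys sit in two different ranges
```

When you ask the question in a review, this is the mental picture behind it: a transfer between accounts 12 and 1500 touches two ranges, so it may need coordination across nodes.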
Example Sentences
Diagnosing a latency issue in a review: “The leaseholder for the hot range was on a node in a different datacenter from the app servers — moving it closer cut p99 latency by more than half.”
Explaining fault tolerance to a stakeholder: “Every range keeps three replicas across different nodes, so losing one machine doesn’t cause an outage — the cluster automatically promotes a surviving replica to leaseholder.”
Discussing data residency in a compliance review: “We used a zone configuration to constrain EU customer ranges to EU nodes specifically, which satisfies residency requirements without standing up a separate regional database.”
Professional Tips
- Reference leaseholder location specifically when debugging distributed latency — “the database is slow” is far less actionable than “the leaseholder for this range is in the wrong region.”
- State the replica count and placement when discussing fault tolerance in a design review — the default of three isn’t universal, and a critical table might warrant a different policy.
- Flag when an operation is a distributed transaction spanning multiple ranges. It is still ACID, but it carries more coordination overhead than a single-range transaction, which is worth knowing for performance-sensitive code paths.
- Use zone configuration as the term when proposing data placement for latency or compliance reasons — it’s the actual mechanism, more precise than saying “we’ll put it in the right region.”
Practice Exercise
- Write a sentence explaining what a leaseholder is and why its location matters.
- Describe how replicas provide fault tolerance in your own words.
- Explain when a transaction becomes a distributed transaction.