What is a shard key, and why does choosing it matter so much?

The shard key (or partition key) is the column (or combination of columns) used to decide which shard a row belongs to — e.g. hashing user_id. A poorly chosen shard key creates "hot shards" where one shard receives disproportionate traffic (e.g. sharding by signup date sends all of today's writes to one shard), which defeats the purpose of sharding. Choosing a shard key with high cardinality and even access patterns is the hardest part of a sharding project.

Do joins still work after horizontal partitioning?

Cross-shard joins are expensive or impossible at the database engine level — most sharded systems either duplicate small reference tables on every shard, denormalise data so joins aren't needed, or perform the join in the application layer by querying multiple shards and merging results. This is one of the biggest costs teams underestimate when sharding.

Can you combine vertical and horizontal partitioning?

Yes, and large systems often do. A common pattern is to vertically partition first — splitting hot and cold columns into separate tables — and then horizontally shard the hot table (which now has a smaller row size) across multiple nodes for write scalability, while the cold table stays on cheaper, less-sharded storage.

What is "resharding" and why is it painful?

Resharding means changing the number of shards or the shard key after data is already distributed — usually because a shard grew too hot or the cluster needs to grow. It requires moving live data between shards without downtime or data loss, which is why many teams over-provision shard count upfront (e.g. 4096 logical shards mapped to fewer physical nodes) so they can rebalance by remapping instead of re-splitting rows.

Database scaling comparison

Vertical vs horizontal partitioning

When a table gets too big, you split it. The question is which direction you cut: by columns (vertical) or by rows (horizontal — better known as sharding). They solve different problems, and mixing them up is one of the most common mistakes in "how would you scale this database" system-design interviews.

TL;DR

Vertical partitioning splits a table by columns — hot, frequently-used columns in one table, cold or large columns in another, joined by a shared key.
Horizontal partitioning (sharding) splits a table by rows — each partition/shard holds a subset of rows but every column, usually distributed across separate servers.
Different problems solved: vertical partitioning shrinks row size and working-set memory; horizontal partitioning scales total data volume and write throughput across machines.

Side-by-side comparison

Aspect	Vertical partitioning	Horizontal partitioning (sharding)
Splits by	Columns	Rows
Each partition contains	All rows, subset of columns	Subset of rows, all columns
Primary goal	Shrink row size, isolate rarely-used or large columns	Scale total storage and write throughput across machines
Typical trigger	Wide table with rarely-accessed large columns (blobs, JSON, history)	Table too large for one server, or write throughput exceeds one server's capacity
Key concept	Shared primary key joining the split tables	Shard key (partition key) that decides row placement
Cross-partition joins	Cheap — usually within one database instance	Expensive or app-level — often avoided by denormalising
Scales writes across servers?	No — all data still lives on one instance	Yes — this is the primary reason to shard
Failure risk if done poorly	Extra join overhead if the split boundary is wrong	Hot shards from a badly chosen shard key
Common real-world use	Separating profile "core fields" from "activity history"	Sharding users table by user_id hash across 16 database clusters

Code / schema side-by-side

Vertical partitioning

-- Before: one wide table
CREATE TABLE users (
  id, email, name, status,
  bio_text, avatar_blob, last_login_json
);

-- After: hot columns split from cold ones
CREATE TABLE users_core (
  id, email, name, status
);
CREATE TABLE users_profile_detail (
  user_id REFERENCES users_core(id),
  bio_text, avatar_blob, last_login_json
);

-- Hot-path query only touches the small table:
SELECT id, email, status FROM users_core
WHERE email = 'a@b.com';

Horizontal partitioning (sharding)

// Application-level shard routing
function shardFor(userId) {
  const NUM_SHARDS = 16;
  return hash(userId) % NUM_SHARDS;
}

function getUser(userId) {
  const shard = shardFor(userId);
  const conn = shardConnections[shard];
  return conn.query(
    "SELECT * FROM users WHERE id = $1",
    [userId]
  );
}
// users 1..1M live on shard 0..15, each on
// its own database server -- writes scale
// roughly linearly as shards are added

When to use vertical partitioning

Most queries only need a handful of "hot" columns. Splitting off large or rarely-read columns (bios, blobs, JSON history) shrinks the working set that has to fit in cache or memory for common queries.
You have a small number of very wide, infrequently accessed columns. Storing them separately avoids paying I/O cost for them on every row scan of the main table.
You want to apply different access controls or storage tiers per column group. Sensitive or bulky data can live in a separately secured or cheaper-storage table.
Single-server capacity is not yet the bottleneck. Vertical partitioning is a lighter-weight first step before reaching for sharding.

When to use horizontal partitioning (sharding)

Total data volume exceeds what one server can hold or serve efficiently. Sharding is the standard answer once vertical scaling (bigger machine) hits diminishing returns.
Write throughput exceeds a single primary's capacity. Distributing writes across shards is the main way to scale write-heavy workloads horizontally.
Your access pattern maps naturally to a partition key. Multi-tenant SaaS (shard by tenant_id) and user-centric apps (shard by user_id) are classic, low-risk sharding candidates.
You've already exhausted vertical partitioning and read replicas. Sharding adds real operational complexity, so it's usually the last lever pulled, not the first.

English phrases engineers use

Vertical partitioning conversations

"Let's split out the cold columns into a separate table."
"That blob field is bloating every row scan — vertically partition it."
"We joined back on the shared key to reassemble the full profile."
"The working set now fits in memory after we split the table."

Sharding conversations

"We picked user_id as the shard key — good cardinality, even distribution."
"Shard 7 is hot — one tenant is generating 40% of the traffic."
"This query needs a cross-shard join, which we try to avoid."
"We're planning a resharding to double our shard count next quarter."
"Over-provision logical shards now so rebalancing is a remap, not a re-split."

Quick decision tree

Rarely-used large columns bloating common queries → Vertical partitioning
Total data or write volume exceeds one server → Horizontal partitioning (sharding)
Need both smaller rows and more write throughput → Vertically partition first, then shard the hot table
Access pattern has an obvious high-cardinality key (tenant_id, user_id) → Shard on that key
Access pattern has no good shard key (e.g. only a boolean status) → Reconsider — pick a different key or avoid sharding
Not yet hitting single-server limits → Neither — read replicas or a bigger instance first

Frequently asked questions

Is horizontal partitioning the same thing as sharding?

Yes, in almost all everyday usage. "Horizontal partitioning" is the formal database term for splitting a table by rows, so each partition holds a subset of the rows but all of the columns. "Sharding" is the same technique applied at a larger scale, usually across separate database instances or servers rather than partitions within one instance. Engineers use the words interchangeably in most conversations.

What is vertical partitioning used for in practice?

Vertical partitioning splits a table by columns rather than rows — for example, keeping frequently-accessed columns (name, status) in one table and rarely-accessed, large columns (full address history, binary blobs, long text fields) in a separate table joined by a shared key. It reduces the amount of data scanned for common queries and can shrink the working set that fits in memory/cache.

Which one solves "my database is too big for one machine"?

Horizontal partitioning (sharding). Splitting rows across multiple database instances lets you scale storage and write throughput roughly linearly by adding more shards. Vertical partitioning does not solve this on its own — you still have all the rows, just spread across narrower tables — though it is often applied first to shrink each row before sharding.

Show more questions (4)