Build fluency in the vocabulary of splitting a topic into ordered, independently scalable partitions.
1 / 5
At standup, a dev mentions a Kafka topic split into several ordered logs distributed across different brokers, so the topic's overall throughput can scale beyond what a single log could handle. What is each of these ordered logs called?
A partition is one of the ordered, append-only logs that a Kafka topic is split into. Because the partitions are spread across the cluster's brokers, a topic's throughput scales by adding partitions and brokers instead of being capped by a single log's capacity. A consumer group, by contrast, is a set of consumers sharing the work of reading a topic, which is related to but distinct from the topic's own partitioning. Partitioning is the foundational mechanism behind Kafka's horizontal scalability for a single topic.
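To make the vocabulary concrete, here is a minimal in-memory sketch of a topic as a set of independent, ordered logs. The `Topic` and `Partition` classes and the "orders" topic are hypothetical illustrations, not Kafka's API; the point is only that each partition keeps its own offsets.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One ordered, append-only log; offsets are local to this partition."""
    records: list = field(default_factory=list)

    def append(self, value: str) -> int:
        self.records.append(value)
        return len(self.records) - 1  # offset of the record just written


@dataclass
class Topic:
    """A topic modeled as a named collection of independent partitions."""
    name: str
    partitions: list

    @classmethod
    def create(cls, name: str, num_partitions: int) -> "Topic":
        return cls(name, [Partition() for _ in range(num_partitions)])


# Hypothetical topic with three partitions.
orders = Topic.create("orders", num_partitions=3)
first = orders.partitions[0].append("order-created")
second = orders.partitions[0].append("order-paid")
other = orders.partitions[1].append("order-created")
```

Offsets grow within partition 0, while partition 1 starts its own count from zero, which is why ordering is a per-partition property.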
2 / 5
During a design review, the team picks a partition key, like a customer ID, ensuring every event for the same customer always lands in the same partition and is therefore processed in order relative to each other. Which capability does this key choice provide?
This key choice provides per-key ordering. Kafka guarantees ordering only within a single partition, and a given key maps to the same partition every time (as long as the topic's partition count does not change), so every event for the same customer ID is stored and consumed in the order it was produced relative to the other events for that key. Kafka does not guarantee ordering across the entire topic, because different partitions are consumed independently and in parallel. That is why a partition key that groups related events together is essential whenever the relative order of those events matters.
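The key-to-partition mapping can be sketched roughly as follows. This is a simplified model, not Kafka's implementation: `partition_for` is a hypothetical helper, and SHA-256 stands in for the murmur2 hash that Kafka's default partitioner applies to the key bytes.

```python
import hashlib

NUM_PARTITIONS = 6  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition (SHA-256 is a stand-in for Kafka's murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


events = [
    ("customer-17", "cart-updated"),
    ("customer-42", "order-placed"),
    ("customer-17", "order-placed"),
    ("customer-17", "order-shipped"),
]

# Append each event to the log of the partition its key hashes to.
logs = {p: [] for p in range(NUM_PARTITIONS)}
for customer_id, event in events:
    logs[partition_for(customer_id)].append((customer_id, event))

# All of customer-17's events sit in one log, in production order.
customer_17_partition = partition_for("customer-17")
customer_17_events = [
    event for cid, event in logs[customer_17_partition] if cid == "customer-17"
]
```

Changing `NUM_PARTITIONS` changes the result of the modulo, which is why the per-key guarantee assumes a stable partition count.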
3 / 5
In a code review, a dev notices a low-cardinality partition key, like a boolean 'isPremium' flag, is being used to route events, so nearly all traffic lands in just two of the topic's many partitions. What does this represent?
This is a hot partition problem caused by a low-cardinality key that cannot spread traffic evenly. A boolean flag has only two possible values, so it can route events into at most two of the topic's partitions, no matter how many partitions the topic has. A consumer group rebalance is a different concept: consumers shifting which partitions they are assigned to. The hot-partition issue leaves most of the topic's partitions, and the brokers hosting them, sitting idle while a couple of partitions bear nearly all the load.
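A small simulation shows why a boolean key can never use more than two partitions. It assumes a simplified hash-modulo partitioner (SHA-256 standing in for Kafka's murmur2), a hypothetical 12-partition topic, and made-up traffic.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 12  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Simplified hash-modulo partitioner (stand-in for Kafka's murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# 1,000 hypothetical events keyed only by the isPremium flag ("True"/"False").
keys = [str(i % 5 == 0) for i in range(1000)]

# Count events per partition; only partitions that received traffic appear.
load = Counter(partition_for(key) for key in keys)
idle = NUM_PARTITIONS - len(load)
```

However many partitions the topic has, `load` holds at most two entries, so at least ten of the twelve partitions here receive nothing.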
4 / 5
An incident report shows one partition's consumer lagged further and further behind while every other partition's consumer stayed caught up, because a low-cardinality partition key concentrated the vast majority of events onto that single partition. What practice would prevent this?
Choosing a higher-cardinality partition key distributes events more evenly across the topic's partitions, since a key with many distinct values spreads traffic instead of concentrating it on one or two partitions. Keeping the same low-cardinality key despite the resulting imbalance is exactly what caused one partition's consumer to fall further and further behind in this incident. Switching to a higher-cardinality key is the standard fix once a hot partition has been identified as the cause of uneven consumer lag.
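One rough way to compare candidate keys before shipping is to measure the share of events that would land on the busiest partition. The sketch below uses invented traffic (10,000 events, 2,000 customers, 20% premium) and a simplified hash-modulo partitioner, so treat the numbers as illustrative only.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 12  # hypothetical partition count


def busiest_share(keys, num_partitions: int = NUM_PARTITIONS) -> float:
    """Fraction of all events that land on the single busiest partition."""
    load = Counter(
        int.from_bytes(hashlib.sha256(k.encode("utf-8")).digest()[:4], "big")
        % num_partitions
        for k in keys
    )
    return max(load.values()) / len(keys)


# Hypothetical traffic: every customer sends the same number of events.
events = [
    {"customer_id": f"customer-{i % 2000}", "is_premium": i % 5 == 0}
    for i in range(10_000)
]

flag_share = busiest_share([str(e["is_premium"]) for e in events])
customer_share = busiest_share([e["customer_id"] for e in events])
```

With the boolean key the busiest partition takes at least the non-premium majority of events, while keying by customer ID brings the busiest partition's share much closer to an even split.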
5 / 5
During a PR review, a teammate asks why the team deliberately picks a partition key based on customer ID instead of just letting Kafka round-robin every event across partitions with no key at all. What is the reasoning?
Round-robin assignment with no key spreads load very evenly across partitions but gives up any ordering guarantee between related events, since two events for the same customer can land on different partitions and be processed out of order relative to each other. (Recent Kafka producers send keyless records to one partition per batch instead of alternating strictly per record, but the effect on ordering is the same.) Keying by customer ID sacrifices some of that even load balancing, since a few customers might generate disproportionately more events than others, in exchange for guaranteeing that every event for the same customer lands in the same partition and stays in order. The tradeoff is exactly this: pick a key when relative ordering matters, and accept the risk of a hot partition if that key's value distribution turns out to be uneven.
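The tradeoff can be seen by routing the same events both ways. This sketch models strict per-record round-robin and a simplified hash-modulo keyed partitioner; both functions and the event list are hypothetical stand-ins, not Kafka client code.

```python
import hashlib
from itertools import count

NUM_PARTITIONS = 4  # hypothetical partition count


def keyed_partition(key: str) -> int:
    """Simplified hash-modulo partitioner (stand-in for Kafka's murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


_counter = count()


def round_robin_partition() -> int:
    """Ignore the key and cycle through partitions, one record at a time."""
    return next(_counter) % NUM_PARTITIONS


events = [
    ("customer-7", "created"),
    ("customer-9", "created"),
    ("customer-7", "paid"),
    ("customer-7", "shipped"),
]

# Partitions touched by customer-7 under each strategy.
keyed_spread = {keyed_partition(cid) for cid, _ in events if cid == "customer-7"}
assignments = [(cid, round_robin_partition()) for cid, _ in events]
round_robin_spread = {p for cid, p in assignments if cid == "customer-7"}
```

Under keying, customer-7 occupies a single partition, so one consumer sees its events in order. Under round-robin the same three events are scattered over three partitions that may be read by different consumers at different speeds.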