Build fluency in the vocabulary of two unrelated variables slowing each other down on one shared cache line.
1 / 5
At standup, a dev mentions two threads on different CPU cores each writing to their own separate variable, yet performance is far worse than expected because both variables happen to sit on the same CPU cache line. What is this problem called?
This is false sharing: two threads write to genuinely separate, unrelated variables, but because those variables sit on the same cache line, every write from one core makes the cache-coherence protocol invalidate the other core's copy of that whole line, even though neither thread ever touches the other's data. A deadlock is an unrelated concurrency failure in which threads block each other indefinitely rather than merely slow each other down. The sharing is called "false" because the variables are not logically shared at all, only physically co-located on the same cache line.
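The scenario can be sketched in C++ as follows. This is a minimal illustration, not a benchmark: the names PackedCounters and run_packed are hypothetical, and the 64-byte line size is an assumption that holds on most current x86-64 parts but not on every CPU.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Two logically independent counters packed side by side.
// Together they take 16 bytes on typical 64-bit targets, so both
// normally fit inside one cache line (64 bytes is an assumption).
struct PackedCounters {
    std::atomic<std::uint64_t> a{0};  // written only by thread 1
    std::atomic<std::uint64_t> b{0};  // written only by thread 2
};

static_assert(sizeof(PackedCounters) <= 64,
              "both counters fit within one assumed 64-byte line");

// Hypothetical helper: each thread increments only its own counter.
inline void run_packed(PackedCounters& c, std::uint64_t iterations) {
    std::thread t1([&] {
        for (std::uint64_t i = 0; i < iterations; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        for (std::uint64_t i = 0; i < iterations; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    // The results are still correct: false sharing costs time, not correctness.
    assert(c.a.load() == iterations && c.b.load() == iterations);
}
```

Neither thread reads or writes the other's counter, so the final values are exact. Any slowdown comes only from the two counters being neighbors in memory.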
2 / 5
During a design review, the team pads a frequently-written-to variable with extra unused bytes so it lands on its own dedicated cache line, away from any other thread's frequently-written variable. Which capability does this padding provide?
This padding avoids false sharing. Giving each thread's frequently written variable its own dedicated cache line means a write by one thread can no longer invalidate the line the other thread is reading or writing, because the two variables are no longer physically co-located. Leaving the two variables tightly packed together is exactly the layout that causes repeated, needless cache-coherence traffic between the cores. Padding is a standard, if slightly memory-wasteful, fix once false sharing has been identified as a real bottleneck.
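One common way to express this padding in C++ is alignas, which lets the compiler insert the unused bytes. The sketch below assumes a 64-byte cache line; kCacheLineSize and PaddedCounter are illustrative names, not part of any library.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check the target CPU before relying on it.
constexpr std::size_t kCacheLineSize = 64;

// alignas raises the alignment to a full line, and the compiler then rounds
// the size up to a multiple of that alignment by adding unused padding bytes.
struct alignas(kCacheLineSize) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

static_assert(alignof(PaddedCounter) == kCacheLineSize,
              "each counter starts on a line boundary");
static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "each counter occupies a whole line, padding included");
```

With this layout, adjacent PaddedCounter objects in an array sit exactly one line apart, at the cost of 56 unused bytes per counter under the stated assumption.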
3 / 5
In a code review, a dev notices a struct holds two counters, each incremented by a different thread on a different core, packed tightly next to each other with no padding at all. What does this represent?
This is a layout at risk of false sharing. Two counters packed tightly together, each incremented by a different thread on a different core, are likely to land on the same cache line, so every increment from either thread forces a needless cache-coherence invalidation on the other core. A cache eviction policy is an unrelated concept that decides which cache entries get discarded; it has nothing to do with CPU cache-line contention. This tight, padding-free packing is exactly the pattern a performance-focused reviewer would flag once two hot, independently written fields sit this close together in memory.
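A reviewer could make that check mechanical with offsetof. The sketch below is only an illustration: Stats, its field names, and may_share_line are hypothetical, and 64 bytes is an assumed line size.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineSize = 64;  // assumption, not universal

// The layout from the review: two hot counters, no padding between them.
struct Stats {
    std::uint64_t hits;    // incremented by one thread
    std::uint64_t misses;  // incremented by another thread
};

// Hypothetical review helper: fields whose offsets are less than one line
// apart can end up on the same cache line, depending on where the struct
// itself is placed in memory.
constexpr bool may_share_line(std::size_t offset_a, std::size_t offset_b) {
    return (offset_a > offset_b ? offset_a - offset_b
                                : offset_b - offset_a) < kCacheLineSize;
}

static_assert(may_share_line(offsetof(Stats, hits), offsetof(Stats, misses)),
              "hits and misses are close enough to share a line");
```

The check is deliberately conservative: it reports a risk, not a certainty, since whether the two fields really share a line also depends on the struct's address at run time.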
4 / 5
An incident report shows a multi-threaded counting service scaled far worse than expected as more cores were added, because a profiler traced the bottleneck to two independently-updated per-core counters that happened to sit on the same cache line, forcing constant cache-coherence traffic between cores. What practice would prevent this?
Padding each per-core counter so it occupies its own dedicated cache line removes the needless cache-coherence invalidation traffic between cores: a write to one counter can no longer force another core to discard and reload its copy of a line that holds none of the data that core actually uses. Continuing to pack the counters tightly together with no padding is exactly what caused the poor scaling in this incident, as more cores were added and contention on that one shared line grew. Per-core padding is the standard fix once a profiler has confirmed that false sharing is the actual bottleneck.
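A padded per-core counter array might look like the sketch below. It assumes C++17 (so that std::vector honors the over-aligned type when allocating) and a 64-byte line; PerCoreCounter and count_in_parallel are hypothetical names, and pinning workers to specific cores is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLineSize = 64;  // assumption, not universal

// One counter per worker, each padded out to its own assumed cache line.
struct alignas(kCacheLineSize) PerCoreCounter {
    std::atomic<std::uint64_t> value{0};
};

// Hypothetical helper: every worker increments only its own slot,
// and the slots are summed after all workers have finished.
inline std::uint64_t count_in_parallel(std::size_t workers,
                                       std::uint64_t per_worker) {
    std::vector<PerCoreCounter> counters(workers);
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&counters, w, per_worker] {
            for (std::uint64_t i = 0; i < per_worker; ++i)
                counters[w].value.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& t : threads) t.join();

    std::uint64_t total = 0;
    for (auto& c : counters) {
        // Each slot holds exactly its own worker's count.
        assert(c.value.load() == per_worker);
        total += c.value.load();
    }
    return total;
}
```

For example, count_in_parallel(4, 10000) returns 40000. The padded and packed layouts produce the same total; only the coherence traffic between cores differs, which is why a profiler is needed to confirm the problem.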
5 / 5
During a PR review, a teammate asks why the team pads hot per-core counters onto separate cache lines instead of just trusting that logically independent variables will never actually interfere with each other's performance. What is the reasoning?
Cache-coherence hardware operates at the granularity of a whole cache line, not an individual variable, so it has no way to know that two variables sharing a line are logically unrelated: when either core writes to any part of the line, the other core's copy of the entire line is invalidated. Two logically independent variables can therefore still contend heavily purely because of how they are laid out in memory, with no actual data dependency between them. The tradeoff is the small amount of extra memory that padding wastes, which is a cheap price for removing a scaling bottleneck that would otherwise get worse as more cores are added.
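The line-granularity point can be shown with plain address arithmetic. In this sketch the 64-byte line size is an assumption, the addresses are made up for illustration, and line_of and same_line are hypothetical helper names.

```cpp
#include <cstdint>

constexpr std::uintptr_t kCacheLineSize = 64;  // assumption, not universal

// Index of the cache line that contains a given byte address.
constexpr std::uintptr_t line_of(std::uintptr_t address) {
    return address / kCacheLineSize;
}

// The coherence protocol tracks lines, so this is the only question it asks;
// which variable lives at each address plays no part.
constexpr bool same_line(std::uintptr_t a, std::uintptr_t b) {
    return line_of(a) == line_of(b);
}

// Made-up addresses for illustration.
static_assert(same_line(0x1000, 0x1038),
              "56 bytes apart, but still inside one line");
static_assert(!same_line(0x1038, 0x1040),
              "only 8 bytes apart, but on adjacent lines");
```

Distance alone does not decide contention: two variables 56 bytes apart can collide while two that are 8 bytes apart do not, depending on where the line boundary falls.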