Learn the vocabulary of a fixed set of reusable worker threads pulling tasks off a shared queue.
1 / 5
At standup, a dev mentions a fixed set of worker threads created once at startup, pulling tasks off a shared queue, rather than the server spawning a brand-new thread for every single incoming request. What is this pattern called?
A thread pool is a fixed set of worker threads, created once at startup, that pull tasks off a shared queue. The same threads are reused across many requests, so the server avoids the cost of creating and tearing down a new thread for each one. A deadlock is an unrelated failure mode in which threads block each other; it has nothing to do with how threads are provisioned. Under high request volume, this reuse makes a thread pool considerably cheaper than spawning a thread per request.
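A minimal sketch of that reuse, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The `handle` function and the pool size of 4 are assumptions chosen for illustration. Note that this executor starts its workers lazily, up to `max_workers`, rather than all at startup, but the reuse behaviour is the same.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle(request_id):
    # Hypothetical request handler: report which worker thread ran the task.
    return threading.current_thread().name

# At most four workers serve twenty tasks, so threads are reused across tasks.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as pool:
    names = list(pool.map(handle, range(20)))

distinct_workers = set(names)
print(len(names), "tasks ran on", len(distinct_workers), "thread(s)")
```

The number of distinct thread names never exceeds the pool size, however many tasks are submitted.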
2 / 5
During a design review, the team wants the pool's size capped at a fixed maximum, with excess tasks queued rather than spawning unbounded additional threads under a traffic spike. Which capability supports this?
A bounded thread pool has a fixed maximum size and is backed by a task queue that absorbs any burst beyond that capacity. Memory and CPU usage stay predictable even under a sudden traffic spike, because excess tasks wait in the queue instead of triggering more threads than the system can handle. An unbounded pool that spawns a new thread per task risks exhausting memory or CPU entirely once traffic spikes hard enough. This bounded-size-plus-queue design is the standard way to keep a thread pool's resource usage under control.
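The bounded design can be sketched with a hand-rolled pool built on `queue.Queue` and `threading.Thread`. `BoundedPool`, `try_submit`, and the sizes used are hypothetical names and values for illustration. Capping the queue itself, and rejecting tasks once it is full, is an extra assumption of this sketch rather than something the explanation above requires.

```python
import queue
import threading

class BoundedPool:
    """Hypothetical minimal pool: a fixed number of workers and a bounded queue."""

    def __init__(self, workers, queue_size):
        self.tasks = queue.Queue(maxsize=queue_size)
        self.threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for thread in self.threads:
            thread.start()

    def _run(self):
        # Each worker loops forever, reusing its thread for every task it pulls.
        while True:
            task = self.tasks.get()
            try:
                task()
            finally:
                self.tasks.task_done()

    def try_submit(self, task):
        # Reject the task instead of adding threads when the queue is full.
        try:
            self.tasks.put_nowait(task)
            return True
        except queue.Full:
            return False

gate = threading.Event()          # held closed to simulate slow work
started = threading.Semaphore(0)  # lets the demo wait until workers are busy

def slow_task():
    started.release()
    gate.wait()

pool = BoundedPool(workers=2, queue_size=3)

# Occupy both workers first so the queue state below is predictable.
for _ in range(2):
    pool.try_submit(slow_task)
for _ in range(2):
    started.acquire()

# Burst of five more tasks: three fit in the queue, two are turned away.
accepted = [pool.try_submit(slow_task) for _ in range(5)]

gate.set()         # let the slow work finish
pool.tasks.join()  # wait until every accepted task has run
```

During the burst the thread count stays at two; the only thing that grows is the queue, and only up to its limit.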
3 / 5
In a code review, a dev notices every single task submitted to the pool blocks on a slow downstream call, and once enough tasks pile up, the pool's threads are all occupied waiting on that same slow dependency. What does this represent?
This is thread pool starvation: every worker ends up blocked on the same slow dependency, leaving no thread free to pick up any other queued task, even tasks that have nothing to do with the slow call. A cache eviction policy is an unrelated concept that governs which cache entries get discarded. This starvation scenario is exactly why a slow downstream dependency needs its own timeout, or its own isolated pool, so it can't monopolize every worker thread in the whole system.
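The starvation scenario can be reproduced in a few lines. This is a sketch under assumptions: `slow_call`, `fast_task`, the pool size of 2, and the 0.2 second wait are all illustrative, and a `threading.Event` stands in for the hung dependency.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

dependency_ready = threading.Event()  # stays unset while the dependency hangs

def slow_call():
    # Hypothetical downstream call that blocks until the dependency responds.
    dependency_ready.wait()
    return "slow done"

def fast_task():
    # Unrelated work that needs no downstream call at all.
    return "fast done"

pool = ThreadPoolExecutor(max_workers=2)

# Two slow calls occupy both workers; the fast task is stuck behind them.
blocked = [pool.submit(slow_call) for _ in range(2)]
fast = pool.submit(fast_task)

try:
    fast.result(timeout=0.2)
    starved = False
except FutureTimeout:
    starved = True  # no worker was free to run the fast task

# Once the dependency recovers, the queued task finally gets a thread.
dependency_ready.set()
fast_result = fast.result(timeout=5)
pool.shutdown()
```

The timeout here only bounds how long the caller waits; the workers stay blocked until the dependency responds, which is why the call to the dependency itself needs a timeout or its own pool.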
4 / 5
An incident report shows the entire API became unresponsive because a single slow downstream dependency caused enough tasks to block that every thread in the shared pool was eventually occupied waiting on it, starving unrelated requests. What practice would prevent this?
Isolating the slow dependency's calls behind their own dedicated thread pool, a pattern often called a bulkhead, ensures that dependency's blocking can only exhaust its own smaller pool rather than starving every worker the rest of the API relies on. Continuing to route everything through one shared pool with no isolation is exactly what let one slow dependency take down the entire API in this incident. This isolation is a standard resilience pattern for any system where different downstream calls have very different latency and failure characteristics.
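A minimal bulkhead sketch, assuming two `ThreadPoolExecutor` instances: one for general work and a smaller one reserved for the slow dependency. The pool sizes and the function names `call_slow_dependency` and `serve_unrelated_request` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

dependency_ready = threading.Event()  # unset while the dependency hangs

def call_slow_dependency():
    # Hypothetical downstream call that blocks until the dependency responds.
    dependency_ready.wait()
    return "dependency answered"

def serve_unrelated_request(n):
    # Work that never touches the slow dependency.
    return n * 2

general_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="general")
bulkhead_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="slow-dep")

# Ten calls pile up on the dependency, but they can only exhaust its own pool.
stuck = [bulkhead_pool.submit(call_slow_dependency) for _ in range(10)]

# Unrelated requests still complete because their workers are untouched.
answers = [
    general_pool.submit(serve_unrelated_request, n).result(timeout=5)
    for n in range(3)
]

dependency_ready.set()
stuck_results = [future.result(timeout=5) for future in stuck]
general_pool.shutdown()
bulkhead_pool.shutdown()
```

Even with the dependency fully hung, only the two `slow-dep` workers are blocked, so the general pool keeps serving traffic.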
5 / 5
During a PR review, a teammate asks why the team sizes the thread pool carefully and monitors its queue depth instead of just setting the maximum size as high as possible to avoid ever queuing a task. What is the reasoning?
An oversized pool can spin up more threads than the CPU can actually schedule efficiently, or more concurrent calls than a downstream dependency can absorb, so simply maximizing the pool's size just shifts the bottleneck elsewhere instead of eliminating it. A carefully sized pool paired with monitored queue depth lets the team see contention building up and react, whether by scaling out, adding a bulkhead, or tuning the size, before it turns into an outage. The tradeoff is the ongoing tuning effort of finding and re-validating the right pool size as traffic patterns and downstream dependencies change over time.
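One way to make queue depth observable is to count in-flight tasks around an executor. The sketch below assumes a hypothetical `MonitoredPool` wrapper; it estimates depth as the number of unfinished tasks beyond the worker count, which avoids reaching into the executor's private internals.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class MonitoredPool:
    """Hypothetical wrapper that tracks how many tasks are waiting for a worker."""

    def __init__(self, max_workers):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._max_workers = max_workers
        self._lock = threading.Lock()
        self._in_flight = 0  # submitted but not yet finished

    def submit(self, fn, *args):
        with self._lock:
            self._in_flight += 1
        future = self._pool.submit(fn, *args)
        future.add_done_callback(self._on_done)
        return future

    def _on_done(self, _future):
        with self._lock:
            self._in_flight -= 1

    def queue_depth(self):
        # Estimate: unfinished tasks beyond the worker count must be queued.
        with self._lock:
            return max(0, self._in_flight - self._max_workers)

    def shutdown(self):
        # Waits for workers to exit, so all done-callbacks have run by then.
        self._pool.shutdown(wait=True)

gate = threading.Event()  # held closed to simulate slow work

pool = MonitoredPool(max_workers=2)
futures = [pool.submit(gate.wait) for _ in range(5)]

depth_under_load = pool.queue_depth()  # 5 unfinished tasks, 2 workers

gate.set()
pool.shutdown()
depth_after_drain = pool.queue_depth()
print("depth under load:", depth_under_load, "after drain:", depth_after_drain)
```

In a real service this number would typically be exported as a metric and alerted on, so that rising depth is visible before requests start timing out.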