Build fluency in the vocabulary of keeping several candidate partial sequences alive during sequence decoding.
1 / 5
At standup, a dev mentions a decoding strategy that keeps the top-k highest-scoring partial sequences at each generation step, expanding all of them, instead of greedily committing to only the single best next token at every step. What is this strategy called?
This is beam search: it keeps the top-k highest-scoring partial sequences, called the beam, at each generation step and expands all of them, whereas greedy decoding commits to only the single best next token at every step. A hash collision is an unrelated hash-table concept in which two keys map to the same bucket. Keeping k candidates alive is why beam search often finds a higher-scoring overall sequence than greedy decoding, which cannot revise a locally optimal but globally poor choice.
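To make the "keep the top-k, expand all of them" step concrete, here is a minimal sketch of a single beam-search step. The names `beam_step` and `toy_logprobs` are hypothetical, and the fixed three-token distribution is an assumption chosen only for illustration; a real decoder would query a model for these log-probabilities.

```python
import heapq
import math

def beam_step(beam, next_token_logprobs, k):
    """Expand every partial sequence in the beam, then keep the k best.

    beam: list of (score, tokens) pairs, where score is the summed
        log-probability of the tokens so far.
    next_token_logprobs: hypothetical callable that maps a token tuple
        to a dict of {next_token: log_probability}.
    """
    candidates = []
    for score, tokens in beam:
        for token, logp in next_token_logprobs(tokens).items():
            candidates.append((score + logp, tokens + (token,)))
    # Prune back down to the k highest-scoring partial sequences.
    return heapq.nlargest(k, candidates, key=lambda c: c[0])

def toy_logprobs(tokens):
    # Assumed toy distribution that ignores the prefix entirely.
    return {"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.2)}

start = [(0.0, ())]                      # one empty partial sequence
first_beam = beam_step(start, toy_logprobs, k=2)
second_beam = beam_step(first_beam, toy_logprobs, k=2)
```

With `k=1` the same function reduces to greedy decoding, since only the single best partial sequence survives each step.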
2 / 5
During a design review, the team switches a translation model's decoder from greedy decoding to beam search, specifically because keeping several candidate partial sequences alive avoids getting permanently stuck after one early token choice that looked good locally but hurt the overall sequence. Which capability does this provide?
The capability is avoiding early, locally optimal choices that hurt the overall sequence. Because several candidate partial sequences stay alive, a prefix whose first token looked slightly worse in isolation can still win if it leads to a better overall translation, provided that prefix is still in the beam. Greedy decoding commits to the single best next token at every step and has no way to reconsider that choice once later tokens reveal it was a mistake. This is why beam search often produces higher-quality output than greedy decoding for tasks like machine translation.
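A small worked example may help show the trap. The sketch below uses a hypothetical two-step probability table (`TOY_MODEL`, invented for illustration) in which the most likely first token leads to a weaker overall sequence. Greedy decoding is modeled as beam search with a width of 1.

```python
import math

# Hypothetical toy model: next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.4, "dog": 0.35, "bird": 0.25},
    ("a",): {"cat": 0.9, "dog": 0.1},
}

def decode(k, steps=2):
    """Return the best (log_score, tokens) pair found with beam width k."""
    beam = [(0.0, ())]
    for _ in range(steps):
        candidates = [
            (score + math.log(p), prefix + (token,))
            for score, prefix in beam
            for token, p in TOY_MODEL[prefix].items()
        ]
        # Keep only the k highest-scoring partial sequences.
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beam[0]

greedy_score, greedy_seq = decode(k=1)   # width 1 is greedy decoding
beam_score, beam_seq = decode(k=2)

print(greedy_seq, round(math.exp(greedy_score), 2))  # ('the', 'cat') 0.24
print(beam_seq, round(math.exp(beam_score), 2))      # ('a', 'cat') 0.36
```

Greedy decoding locks in "the" because 0.6 beats 0.4, and the best it can then reach is 0.24. With a width of 2, the "a" prefix survives the first step and ends up with the higher overall probability of 0.36.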
3 / 5
In a code review, a dev notices a translation model's decoder commits to only the single highest-scoring next token at every step and discards every other candidate immediately, instead of keeping the top-k candidate partial sequences alive with beam search. What does this represent?
This is a missed beam-search opportunity, since keeping the top-k candidate partial sequences alive would let the decoder recover from an early token choice that looks good locally but hurts the overall sequence. A cache eviction policy is an unrelated concept about which cache entries to discard. This commit-to-one-candidate pattern is the kind of quality gap a reviewer flags once output quality matters more than raw decoding speed.
4 / 5
An incident report shows a translation model's output quality dropped sharply on longer sentences, because its decoder committed to only the single highest-scoring next token at every step and had no way to recover once an early token choice turned out to hurt the overall sequence. What practice would prevent this?
Switching to beam search keeps several candidate partial sequences alive, letting the decoder recover from an early token choice that hurt the overall sequence. Continuing to commit to only the single highest-scoring next token at every step is what caused the quality drop in this incident: the longer the sentence, the more an early mistake costs, because every later token is conditioned on it. Keeping multiple candidates is the standard fix once greedy decoding is confirmed to hurt output quality on longer sequences.
5 / 5
During a PR review, a teammate asks why the team reaches for beam search instead of greedy decoding, given that greedy decoding is simpler and faster since it only ever tracks one candidate sequence. What is the reasoning?
Beam search keeps several candidate partial sequences alive at each step, typically finding a higher-scoring overall sequence at the cost of more computation per step, while greedy decoding is simpler and faster but can get permanently stuck after one locally optimal but globally poor token choice. This is exactly why beam search is favored when output quality matters most, while greedy decoding remains attractive when speed is the priority.
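The cost side of this trade-off can be sketched by counting next-token queries. The function below is a rough illustration under simplifying assumptions: `count_model_calls` is a hypothetical helper, each live partial sequence costs one query per step, and batching, early stopping, and end-of-sequence handling are ignored.

```python
def count_model_calls(k, steps, vocab_size):
    """Count next-token queries for beam width k (k=1 is greedy decoding)."""
    beam_size = 1   # decoding starts from a single empty prefix
    calls = 0
    for _ in range(steps):
        calls += beam_size                           # one query per live sequence
        beam_size = min(k, beam_size * vocab_size)   # prune to at most k
    return calls

greedy_calls = count_model_calls(k=1, steps=10, vocab_size=1000)
beam_calls = count_model_calls(k=4, steps=10, vocab_size=1000)

print(greedy_calls)  # 10
print(beam_calls)    # 37
```

Under these assumptions the work grows roughly linearly with the beam width, which is the extra computation traded for a better chance of finding a higher-scoring sequence.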