Advanced Vocabulary #data-science-ml #backend #developer-tools

Beam Search Vocabulary

Build fluency in the vocabulary of keeping several candidate partial sequences alive during sequence decoding.

Question 1 / 5
At standup, a dev describes a decoding strategy that, at each generation step, expands every surviving partial sequence and keeps only the k highest-scoring results, rather than greedily committing to the single best next token. What is this strategy called?
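
To make the described behavior concrete, here is a minimal sketch of the strategy in the question. Everything in it is an assumption for illustration: the toy transition table `TOY_PROBS`, the tokens `<s>`, `a`, `b`, `x`, `y`, `</s>`, and the function name `decode_keep_top_k` are hypothetical and do not come from any particular library.

```python
import math

# Hypothetical toy "model": next-token probabilities depend only on the last token.
TOY_PROBS = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}


def next_logprobs(token):
    """Return log-probabilities for each possible next token."""
    return {tok: math.log(p) for tok, p in TOY_PROBS[token].items()}


def decode_keep_top_k(k, max_steps=5):
    """Keep the k best partial sequences at every step (k=1 is greedy decoding)."""
    # Each candidate is (cumulative log-probability, list of tokens so far).
    candidates = [(0.0, ["<s>"])]
    for _ in range(max_steps):
        expanded = []
        for score, tokens in candidates:
            if tokens[-1] == "</s>":
                # Finished sequences are carried forward unchanged.
                expanded.append((score, tokens))
                continue
            for tok, logp in next_logprobs(tokens[-1]).items():
                expanded.append((score + logp, tokens + [tok]))
        # Prune: keep only the k highest-scoring candidates.
        candidates = sorted(expanded, key=lambda c: c[0], reverse=True)[:k]
        if all(tokens[-1] == "</s>" for _, tokens in candidates):
            break
    return candidates


greedy_best = decode_keep_top_k(k=1)[0]
wider_best = decode_keep_top_k(k=2)[0]
print(greedy_best[1], round(math.exp(greedy_best[0]), 2))
print(wider_best[1], round(math.exp(wider_best[0]), 2))
```

In this toy setup, `k=1` commits to `a` first because 0.6 beats 0.4, and ends with a sequence of probability 0.33. With `k=2`, the lower-ranked first token `b` survives the first step and leads to a sequence of probability 0.36, which is the trade-off the question is pointing at.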