Advanced Vocabulary #ai #backend #developer-tools

Speculative Decoding Vocabulary

Learn the vocabulary of speeding up language model generation with a smaller draft model.

Question 1 of 5
At standup, a dev mentions using a smaller, faster draft model to propose several tokens ahead, which a larger model then verifies in a single pass to speed up generation. What technique is this?
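
For readers who want to see the loop the question describes, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `draft_and_verify`, `toy_target`, `toy_draft`, and `greedy_baseline` are hypothetical, the "models" are toy functions over integer tokens, and decoding is greedy. A real system would verify all proposed positions in one batched forward pass of the larger model and would typically use a probabilistic accept/reject rule rather than exact matching.

```python
from typing import Callable, List

# Hypothetical model type: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]

def toy_target(prefix: List[int]) -> int:
    # Toy "large" model: counts upward modulo 10.
    return (prefix[-1] + 1) % 10

def toy_draft(prefix: List[int]) -> int:
    # Toy "small" model: agrees with the target except after token 4.
    return 0 if prefix[-1] == 4 else (prefix[-1] + 1) % 10

def greedy_baseline(target: Model, prompt: List[int], max_new: int) -> List[int]:
    # Ordinary decoding: one target call per generated token.
    tokens = list(prompt)
    for _ in range(max_new):
        tokens.append(target(tokens))
    return tokens

def draft_and_verify(target: Model, draft: Model, prompt: List[int],
                     k: int, max_new: int) -> List[int]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # Step 1: the draft model proposes k tokens ahead.
        proposed: List[int] = []
        for _ in range(k):
            proposed.append(draft(tokens + proposed))

        # Step 2: the target model checks each proposed position.
        # This sketch loops for clarity; in practice it is one batched pass.
        accepted: List[int] = []
        for i in range(k):
            expected = target(tokens + proposed[:i])
            if proposed[i] == expected:
                accepted.append(proposed[i])
            else:
                # First mismatch: keep the target's token and stop.
                accepted.append(expected)
                break
        else:
            # All k proposals matched, so the target's next token comes for free.
            accepted.append(target(tokens + proposed))

        tokens.extend(accepted)
    # The last round may overshoot max_new, so trim.
    return tokens[:len(prompt) + max_new]

result = draft_and_verify(toy_target, toy_draft, [0], k=3, max_new=12)
baseline = greedy_baseline(toy_target, [0], max_new=12)
```

In this greedy setting the output matches what the larger model would have produced alone; the draft model only changes how many tokens are settled per verification round.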