LLM Context Window Management Language Collocations
Practise the standard verbs for managing an LLM's context window under a token limit.
0 / 5 completed
1 / 5
Fill in: 'We ___ the conversation history to the most relevant recent turns so a long-running chat doesn't quietly exceed the model's context window mid-conversation.'
We 'trim history' — the standard, established collocation for shortening context to fit a model's limit. The other options aren't the recognised term here.
2 / 5
Fill in: 'Appending every retrieved document to the prompt without a limit can ___ the context window filled before the model even reaches the actual user question.'
We say unlimited appending will 'leave' the window filled before the real question — the standard, natural collocation for the resulting problem. The other options aren't idiomatic here.
3 / 5
Fill in: 'We ___ a strict token budget per prompt section so retrieved context can never crowd out the system instructions or the user's own message.'
We 'budget tokens' — the standard, simple collocation for allocating a fixed share of context per component. The other options are less idiomatic here.
4 / 5
Fill in: 'We ___ token counts before every request, since an approximation that's off by a few hundred tokens can silently push a long prompt over the model's hard limit.'
We 'monitor' token counts — the standard collocation for ongoing observation of a usage metric approaching a limit. The other options aren't idiomatic here.
5 / 5
Fill in: 'We ___ the oldest turns of a long conversation into a short summary rather than dropping them outright, so earlier context isn't simply lost once it falls out of the window.'
We 'summarize turns' — the standard, established collocation for compressing older context to preserve it within a limited window. The other options aren't the recognised term here.