Practice context window vocabulary: token limits, document chunking, batch processing, attention degradation in long contexts, and context compression strategies.
1 / 5
'The context window is 128K _____.' What unit measures the amount of text a model can process at once?
Tokens are the units LLMs process. In English, one token is roughly 3/4 of a word, so a 128K-token context window holds about 96,000 words (roughly 100,000), or ~300 pages of text.
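The arithmetic behind that estimate can be sketched in a few lines. This is only a rule-of-thumb calculation: the 0.75 words-per-token ratio comes from the explanation above, while the 300 words-per-page figure and the function names are assumptions made for illustration. Real counts depend on the tokenizer and the text.

```python
# Rule-of-thumb conversion only; actual token counts depend on the tokenizer.
WORDS_PER_TOKEN = 0.75  # rough English average, as stated above


def approx_words(token_limit: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return int(token_limit * WORDS_PER_TOKEN)


def approx_pages(token_limit: int, words_per_page: int = 300) -> int:
    """Estimate page count; words_per_page is an assumed typical value."""
    return approx_words(token_limit) // words_per_page


print(approx_words(128_000))  # 96000
print(approx_pages(128_000))  # 320
```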
2 / 5
'The document exceeds the _____ window.' What is the limit being exceeded?
When a document is too large for the model to process in one call, it 'exceeds the context window', so it has to be handled with chunking, summarization, or retrieval strategies.
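A quick pre-flight check for this situation might look like the sketch below. It assumes the same rough ratio of about 4 tokens per 3 English words; `estimate_tokens`, `fits_in_context`, and the `reserved_for_output` margin are hypothetical names and values chosen for illustration, not part of any library. A real system would count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 tokens per 3 whitespace-separated words."""
    words = len(text.split())
    return (words * 4 + 2) // 3  # integer ceiling of words * 4 / 3


def fits_in_context(
    text: str,
    context_window: int = 128_000,
    reserved_for_output: int = 4_000,  # assumed margin for the model's reply
) -> bool:
    """Return True if the estimated prompt size leaves room for the reply."""
    return estimate_tokens(text) <= context_window - reserved_for_output


document = "word " * 100_000
if not fits_in_context(document):
    print("Document exceeds the context window; chunk, summarize, or retrieve.")
```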
3 / 5
'We _____ the document and process in batches.' What technique handles oversized documents?
'Chunking' means splitting a large document into smaller pieces (often overlapping, so that context is not lost at the boundaries) that each fit within the context window, then processing each chunk separately.
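A minimal sketch of overlapping chunking follows. For simplicity it splits on words rather than tokens, which is an assumption of this example; `chunk_words` and its parameters are illustrative names, and production splitters usually measure size in tokens and try to respect sentence or paragraph boundaries.

```python
def chunk_words(words: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split a word list into chunks of chunk_size that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the document
    return chunks


text = "the quick brown fox jumps over the lazy dog again"
for chunk in chunk_words(text.split(), chunk_size=4, overlap=1):
    print(" ".join(chunk))
```

Each chunk starts with the last word of the previous one, so a sentence that straddles a boundary is still seen whole in at least part of its context.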
4 / 5
'The model's attention degrades in the middle of a long context' — this is known as the _____ problem.
The 'lost in the middle' problem refers to research showing that LLMs perform worse on information placed in the middle of a long context than on information placed near the beginning or end.
5 / 5
What does 'context compression' refer to?
Context compression refers to strategies (summarization, selective retrieval, prompt pruning) that reduce the number of tokens a prompt uses while keeping its most useful information, so that more relevant material fits within the context window.
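One simple form of selective retrieval can be sketched as a greedy selection under a token budget. Everything here is an assumption for illustration: the relevance scores are taken as given (in practice they might come from a search or embedding step), token cost uses the rough 4-tokens-per-3-words estimate, and `select_passages` is a hypothetical helper, not a library function.

```python
def select_passages(passages: list[tuple[float, str]], token_budget: int) -> list[str]:
    """Keep the highest-scoring passages that fit within token_budget."""
    selected = []
    used = 0
    # Visit passages from most to least relevant.
    for score, text in sorted(passages, key=lambda p: p[0], reverse=True):
        cost = (len(text.split()) * 4 + 2) // 3  # rough token estimate
        if used + cost <= token_budget:
            selected.append(text)
            used += cost
    return selected


candidates = [
    (0.9, "alpha beta gamma"),
    (0.2, "one two three four five six"),
    (0.7, "delta epsilon"),
]
print(select_passages(candidates, token_budget=8))
# ['alpha beta gamma', 'delta epsilon']
```

The low-scoring passage is dropped because it would push the prompt past the budget, which is the trade-off compression makes: fewer tokens, with the most relevant material kept.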