Practice English vocabulary for LLM context windows: context length limits, prompt construction, context stuffing, chunking strategies, and context window management.
0 / 5 completed
1 / 5
What is 'the context window' of an LLM?
The context window is the LLM's working memory for a single call. Everything the model can 'see' when generating a response must fit within the limit (e.g., 128K tokens for Claude 3). Input + output combined cannot exceed this limit.
2 / 5
What does 'prompt construction' mean in the context of LLM applications?
Prompt construction is an engineering concern: assembling components in the right order, managing token budgets for each section (system: 500 tokens, history: 2000, context: 8000, query: 200), and ensuring the result fits the context window while maximizing information quality.
3 / 5
What is 'context stuffing' and why is it problematic?
Context stuffing (adding large amounts of raw text) seems intuitive but hurts quality: models lose focus in very long contexts (the 'lost in the middle' effect), it increases latency and cost, and it prevents targeted retrieval. Curated, relevant context outperforms raw volume.
4 / 5
What is 'chunking strategy' for documents in a RAG pipeline?
Chunking strategy significantly impacts RAG quality. Too-small chunks lack context; too-large chunks include irrelevant content that dilutes the signal. Semantic chunking (splitting at natural boundaries) outperforms fixed-size chunking for most document types.
5 / 5
What does 'context window management' mean in a multi-turn application?
In long conversations, history grows until it exceeds the context window. Strategies: sliding window (drop oldest turns), summarization (compress old history into a summary), selective memory (keep only turns with tool calls or key facts), and vector memory (retrieve relevant past turns).