RAG Pipeline Vocabulary
5 exercises — Master the English vocabulary for describing RAG architectures, chunking strategies, embedding models, and retrieval quality.
Quick reference: RAG Pipeline
- chunking — splitting source documents into segments for embedding and indexing
- embedding — converting a text chunk into a dense numerical vector encoding its meaning
- vector store — a database optimised to store and search embedding vectors (pgvector, Pinecone, Weaviate)
- cosine similarity — similarity metric: cos(angle) between two vectors; 1.0 = identical direction
- hybrid retrieval — combining dense (semantic) and sparse (BM25 keyword) search scores
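To make the last two entries concrete, here is a minimal sketch in plain Python. The function names `cosine_similarity` and `hybrid_score`, the `alpha` weight, and the weighted-sum fusion are illustrative assumptions, not taken from any particular vector store; real systems may fuse dense and sparse results differently (for example by rank rather than by score).

```python
import math


def cosine_similarity(a, b):
    """Return cos(angle) between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def hybrid_score(dense, sparse, alpha=0.5):
    """Hypothetical fusion: weighted sum of a dense and a sparse score.

    Assumes both scores were already normalised to a comparable range
    (raw BM25 scores are unbounded, so they need rescaling first).
    """
    return alpha * dense + (1 - alpha) * sparse


# Vectors pointing the same way score 1.0 regardless of length.
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
# Perpendicular vectors score 0.0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because cosine similarity ignores vector length, only direction matters, which is why "1.0 = identical direction" holds even when the two vectors have different magnitudes.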
1 / 5
During a RAG architecture review, a teammate says: "Our chunks are 2,000 tokens with no overlap — retrieval quality is terrible because the answer spans a chunk boundary." Which change best addresses this problem?
Chunk overlap prevents information loss at boundaries.
When a chunk boundary falls in the middle of a fact or sentence, neither chunk contains the full context needed to answer a query. Adding chunk overlap, that is, repeating a portion of tokens in both adjacent chunks (typically 10–20% of chunk size), ensures that a passage straddling a boundary appears whole in at least one chunk, provided the passage is no longer than the overlap.
Chunk size controls the granularity of retrieval: smaller chunks improve precision (less noise per chunk) but can miss multi-sentence context; larger chunks capture more context but dilute the signal. Overlap is the standard mitigation for boundary splitting.
Embedding dimension, distance metric, and temperature are separate concerns: changing them does not repair an answer that is split across a chunk boundary.
Key vocabulary:
• chunk size — number of tokens (or characters) in each text segment fed to the embedder
• chunk overlap — number of tokens shared between consecutive chunks to avoid information loss at boundaries
• chunk boundary — the edge between two adjacent chunks, where split answers may be lost
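The three terms above can be illustrated with a small sliding-window chunker. This is a sketch under stated assumptions: `chunk_tokens` is a hypothetical helper, and the input is assumed to be a list of tokens already produced by some tokenizer (integers stand in for tokens here).

```python
def chunk_tokens(tokens, chunk_size, overlap):
    """Split a token list into chunks of `chunk_size` tokens, where each
    chunk repeats the last `overlap` tokens of the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


tokens = list(range(10))  # stand-in for real tokenizer output

# No overlap: tokens 3 and 4 sit on opposite sides of a chunk boundary.
no_overlap = chunk_tokens(tokens, chunk_size=4, overlap=0)

# Overlap of 1: each boundary token is repeated in the next chunk.
with_overlap = chunk_tokens(tokens, chunk_size=4, overlap=1)
```

With `overlap=0` the chunks are `[0..3]`, `[4..7]`, `[8, 9]`; with `overlap=1` they are `[0..3]`, `[3..6]`, `[6..9]`, so every boundary token appears in two consecutive chunks. For the 2,000-token chunks in the question, a 10–20% overlap would correspond to roughly 200–400 shared tokens.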