Advanced LLM App Development #rag #embeddings #vector-database #retrieval

RAG Pipeline Vocabulary

5 exercises — Master the English vocabulary for describing RAG architectures, chunking strategies, embedding models, and retrieval quality.

1 / 5

During a RAG architecture review, a teammate says: "Our chunks are 2,000 tokens with no overlap — retrieval quality is terrible because the answer spans a chunk boundary." Which change best addresses this problem?
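To make the chunk-boundary problem concrete, here is a minimal sketch of fixed-size chunking with and without overlap. It assumes whitespace-separated words stand in for tokens, and the function name `chunk_tokens` and the tiny sizes are illustrative only; a real pipeline would use the embedding model's tokenizer and much larger windows.

```python
def chunk_tokens(tokens, chunk_size, overlap):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


# Assumption: words stand in for tokens to keep the example readable.
tokens = "the refund window is thirty days from the delivery date".split()

no_overlap = chunk_tokens(tokens, chunk_size=5, overlap=0)
with_overlap = chunk_tokens(tokens, chunk_size=5, overlap=2)

for name, chunks in (("no overlap", no_overlap), ("overlap=2", with_overlap)):
    print(name)
    for chunk in chunks:
        print("  ", " ".join(chunk))
```

Without overlap, "thirty" and "days" land in different chunks, so neither chunk contains the full answer. With an overlap of two tokens, the second window contains both words together.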

2 / 5

A colleague explains: "We embed using text-embedding-3-large — it maps each chunk to a 3,072-dimensional vector that captures semantic similarity." What does the phrase "captures semantic similarity" mean precisely?
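One way to see what the phrase means in practice: texts with related meanings should map to vectors that point in similar directions, which is commonly measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of real 3,072-dimensional embeddings; the variable names and values are hypothetical and were not produced by any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, invented for illustration only.
refund_policy = [0.9, 0.1, 0.0]   # "What is the refund policy?"
money_back = [0.8, 0.2, 0.1]      # "Can I get my money back?"
gpu_driver = [0.0, 0.2, 0.9]      # "How do I update my GPU driver?"

sim_close = cosine_similarity(refund_policy, money_back)
sim_far = cosine_similarity(refund_policy, gpu_driver)

print(f"refund vs money back: {sim_close:.3f}")
print(f"refund vs GPU driver: {sim_far:.3f}")
```

The two refund-related questions share almost no words, yet their vectors score close to 1, while the unrelated question scores near 0.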

3 / 5

In an architecture discussion, the team debates vector database options. A senior engineer says: "We're on Postgres already — pgvector avoids operational overhead. But if we need ANN at billion-scale with metadata filtering, Pinecone or Weaviate might win." What does ANN mean in this context?
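The trade-off behind the acronym can be sketched by comparing an exact scan with an approximate index. The example below uses random-hyperplane hashing because it fits in a few lines; it is an illustration of the general idea, not how pgvector, Pinecone, or Weaviate work internally (production systems typically rely on graph- or cluster-based indexes such as HNSW or IVF). The class name `HyperplaneIndex` and all parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def exact_search(vectors, query, k=1):
    """Exact nearest neighbours: score every stored vector."""
    return sorted(vectors, key=lambda i: cosine(query, vectors[i]), reverse=True)[:k]


class HyperplaneIndex:
    """Toy approximate index: vectors on the same side of every random
    hyperplane share a bucket, and a query scans only its own bucket."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _signature(self, vec):
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        self.buckets[self._signature(vec)].append(item_id)

    def candidates(self, query):
        return self.buckets.get(self._signature(query), [])

    def query(self, query, k=1):
        ranked = sorted(self.candidates(query),
                        key=lambda i: cosine(query, self.vectors[i]), reverse=True)
        return ranked[:k]


# Assumption: random Gaussian vectors stand in for real embeddings.
rng = random.Random(1)
vectors = {f"doc{i}": [rng.gauss(0, 1) for _ in range(8)] for i in range(200)}

index = HyperplaneIndex(dim=8)
for doc_id, vec in vectors.items():
    index.add(doc_id, vec)

query = list(vectors["doc17"])
print("exact:      ", exact_search(vectors, query))
print("approximate:", index.query(query))
print("scanned", len(index.candidates(query)), "of", len(vectors), "vectors")
```

The approximate query scores only the vectors in one bucket instead of all 200. The cost is that a true neighbour sitting in a different bucket can be missed, which is why such indexes report recall against an exact baseline.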

4 / 5

During a retrieval quality retrospective, an engineer reports: "Our precision@5 is high but our recall is low — we're returning relevant documents when we retrieve, but we're missing a lot of relevant documents entirely." Which metric best captures what they're missing?
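The situation the engineer describes can be reproduced with a small worked example. The sketch below computes precision and recall at a cutoff k for one query; the document IDs and the relevance set are invented for illustration.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


# Hypothetical data: 20 relevant documents exist, but only 4 are retrieved.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d2", "d3", "d4"} | {f"d{i}" for i in range(6, 22)}

p5 = precision_at_k(retrieved, relevant, k=5)
r5 = recall_at_k(retrieved, relevant, k=5)
print(f"precision@5 = {p5:.2f}")
print(f"recall@5    = {r5:.2f}")
```

Four of the five returned documents are relevant, so precision@5 is 0.80. Those four are only a fifth of the 20 relevant documents in the corpus, so recall@5 is 0.20.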

5 / 5

A teammate proposes a solution for handling both keyword-specific and semantic queries: "We'll run dense semantic search alongside sparse BM25, then combine the scores — hybrid retrieval usually beats pure vector search for technical documentation." When does hybrid retrieval offer the biggest advantage over pure vector search?
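As a rough sketch of "combine the scores", the example below min-max normalizes the dense and sparse scores and takes a weighted sum. This is one of several possible fusion schemes (reciprocal rank fusion is a common alternative), and the query, document names, scores, and the `alpha` weight are all hypothetical.

```python
def min_max(scores):
    """Rescale scores to [0, 1] so dense and sparse values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(dense, sparse, alpha=0.5):
    """Weighted sum of normalized scores; alpha is the dense weight."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    docs = set(dense_n) | set(sparse_n)
    return {
        doc: alpha * dense_n.get(doc, 0.0) + (1 - alpha) * sparse_n.get(doc, 0.0)
        for doc in docs
    }


# Hypothetical scores for the query "ERR_CONN_4012 timeout".
# Dense search prefers the general page; BM25 matches the exact error code.
dense = {"timeouts-overview": 0.82, "err-4012-reference": 0.78, "retry-guide": 0.70}
sparse = {"err-4012-reference": 12.4, "timeouts-overview": 3.1, "retry-guide": 0.5}

combined = hybrid_scores(dense, sparse)
ranked = sorted(combined, key=combined.get, reverse=True)

print("dense only:", max(dense, key=dense.get))
print("hybrid:    ", ranked)
```

In this toy case the dense ranking alone puts the general overview first, while the fused ranking promotes the page that contains the exact identifier from the query.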