Haystack 2.0: English for Building RAG Pipelines
Master English vocabulary for Haystack 2.0: pipelines, components, DocumentStore, generators, retrievers, rankers, and embedding models for RAG pipeline development.
Haystack 2.0 by deepset reimagined the original framework as a composable, component-based architecture for building production RAG systems. If you are working with Haystack in a multinational team or following the English-language documentation and community forums, you will need a firm grasp of the framework’s specific vocabulary. This guide covers the key terms you will encounter when designing, building, and debugging Haystack pipelines.
Key Vocabulary
Pipeline — the central abstraction in Haystack 2.0, a directed graph of connected components that data flows through from input to output. Unlike a strict directed acyclic graph, a Haystack 2.0 pipeline may also contain loops.
Definition sentence: Rather than a linear sequence of steps, a Haystack Pipeline can branch, merge, and conditionally route data depending on intermediate results.
Example: “We built a separate indexing pipeline and a querying pipeline, connecting them through a shared document store.”
Component — a self-contained, reusable processing unit with declared inputs and outputs that can be connected to other components within a pipeline.
Definition sentence: Any Python class decorated with @component that exposes a run method becomes a valid Haystack component.
Example: “We wrote a custom component to strip personally identifiable information from documents before they enter the document store.”
DocumentStore — the storage backend responsible for persisting documents and their vector embeddings, supporting retrieval queries against them.
Definition sentence: Haystack supports multiple DocumentStore backends — including OpenSearch, Weaviate, Qdrant, and an in-memory store for local development.
Example: “The QdrantDocumentStore was chosen because it supports named vectors, which we needed for our hybrid retrieval strategy.”
Retriever — a component that queries the DocumentStore to return the most relevant documents for a given query, using dense vector search, sparse keyword search, or a combination of both.
Definition sentence: The retriever sits between the query input and the generator, narrowing a large document collection down to the most relevant context.
Example: “We replaced the dense retriever with a hybrid retriever after noticing that exact product-code queries were performing poorly with embeddings alone.”
Generator — a component that wraps a language model and produces text output, typically receiving retrieved documents as context alongside the user’s question.
Definition sentence: Haystack provides generators for OpenAI, Anthropic, Cohere, and locally hosted models, some built in and others available through integration packages, all following the same interface.
Example: “We swapped in a local HuggingFaceLocalGenerator for our on-premises deployment without changing any other part of the pipeline.”
Ranker — a component positioned after the retriever that reorders retrieved documents by relevance using a cross-encoder model, typically more accurate than the initial retrieval but more expensive to run.
Definition sentence: Adding a ranker improves the quality of the context window passed to the generator, especially when the retriever returns a large candidate set.
Example: “The ranker reduced the top-20 retrieval results down to the top-5 before they were passed to the generator, cutting prompt length and improving answer quality.”
Embedding model — a model that converts text into dense numerical vectors, enabling semantic similarity search in the document store.
Definition sentence: The same embedding model must be used during indexing and at query time, otherwise the vector representations are not comparable.
Example: “We benchmarked three embedding models on our domain-specific test set before settling on one that balanced accuracy and inference speed.”
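To make "semantic similarity search" concrete, here is a framework-independent sketch in plain Python. The three-dimensional vectors are invented toy values standing in for real embeddings, and cosine similarity is assumed as the comparison metric (a common choice, though stores may use others).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" written by hand; a real embedding model would produce these.
indexed = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
}

# The query vector is only comparable if it comes from the same model
# that produced the indexed vectors.
query_vector = [0.8, 0.2, 0.1]

best_match = max(
    indexed,
    key=lambda name: cosine_similarity(query_vector, indexed[name]),
)
```

The comparison is meaningful only because both sides live in the same vector space, which is why switching models forces a full re-index.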
Hybrid retrieval — a retrieval strategy that combines dense vector search with sparse keyword-based search (such as BM25), then merges and re-ranks the results to capture both semantic and lexical relevance.
Definition sentence: Hybrid retrieval is particularly valuable for technical domains where exact terminology matters as much as conceptual similarity.
Example: “Hybrid retrieval improved our recall on queries containing specific API method names, which the dense embeddings had been handling poorly.”
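The "merge" step can be done in several ways. The sketch below uses reciprocal rank fusion, one common merging method, written in plain Python; it is an illustrative assumption rather than the approach the speakers quoted here necessarily used. The document IDs and the constant k=60 are likewise illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_results = ["doc_a", "doc_b", "doc_c"]   # from vector search
sparse_results = ["doc_d", "doc_a", "doc_b"]  # from keyword search (e.g. BM25)

merged = reciprocal_rank_fusion([dense_results, sparse_results])
```

Here doc_a and doc_b appear in both lists, so they outrank doc_d even though doc_d was the top keyword hit.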
Useful Phrases
- “The indexing pipeline runs as a batch job every night; the querying pipeline is served as a FastAPI endpoint that handles live user requests.”
- “We wired the retriever’s output directly into the ranker’s input, then passed the ranker’s output to the prompt builder before the generator.”
- “If you change the embedding model, you need to re-index all your documents — the existing vectors in the store will be incompatible.”
- “We’re using the in-memory document store for local development and integration tests, then swapping in Qdrant for staging and production.”
- “The custom component uses the @component decorator and exposes a run method that returns a typed dictionary; Haystack validates connections against the declared input and output types when the pipeline is built.”
Common Mistakes
Confusing “retriever” and “ranker”
Both components deal with selecting relevant documents, but they operate at different stages and with different mechanisms. A retriever performs a fast approximate search over the entire document store. A ranker performs a slower, more precise reordering of a small candidate set already returned by the retriever. Saying “the ranker searches the database” is incorrect — say “the ranker reorders the candidates returned by the retriever.”
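The division of labour can be sketched in plain Python. This is a toy illustration only: shared-word counting stands in for the fast retriever, and Jaccard overlap stands in for the slower, more precise cross-encoder. Both scoring functions and the sample corpus are assumptions made for the example.

```python
def tokenize(text):
    return set(text.lower().split())


def retrieve(query, corpus, top_k):
    """Stage 1: cheap scoring over the whole corpus (shared-word count)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def rerank(query, candidates, top_k):
    """Stage 2: finer scoring over the candidates only.

    Jaccard overlap is a stand-in here for a cross-encoder model.
    """
    query_tokens = tokenize(query)

    def score(doc):
        doc_tokens = tokenize(doc)
        return len(query_tokens & doc_tokens) / len(query_tokens | doc_tokens)

    return sorted(candidates, key=score, reverse=True)[:top_k]


corpus = [
    "reset your password from the login page and many other account settings pages",
    "reset your password",
    "shipping times for international orders",
    "password rules",
]
query = "reset password"

candidates = retrieve(query, corpus, top_k=3)  # searches the whole corpus
top_docs = rerank(query, candidates, top_k=1)  # reorders the candidates only
```

Note that rerank never sees the corpus, only the candidates: that is exactly the distinction the two words encode.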
Using “pipeline” loosely to mean “any process”
In everyday English, pipeline is used metaphorically for any sequence of steps. In Haystack, Pipeline has a specific technical meaning: it is an instantiated object with a graph structure, validated connections, and a run method. When discussing Haystack specifically, reserve the word pipeline for this technical object rather than using it as a general metaphor, to avoid confusion in team discussions.
Misusing “connect” versus “wire up”
Both phrases are used when linking components in Haystack, but connect is the more neutral, formal term (and the name of the actual method: pipeline.connect(...)). Wire up is idiomatic and informal, common in spoken conversation and informal documentation. In written technical specifications, prefer connect; in conversation and code-review chats, both are natural.
Haystack 2.0’s composable architecture reflects a broader trend in the industry towards modular, observable AI systems. Mastering this vocabulary not only helps you use the framework more effectively — it also equips you to describe your system’s design clearly to stakeholders, to write accurate technical documentation, and to contribute meaningfully to the open-source community around RAG development.