LangChain English: LCEL and RAG Pipeline Vocabulary

Learn the English vocabulary used in LangChain development — LCEL chains, RAG pipelines, retrievers, memory, and agent vocabulary explained in professional context.

Introduction

LangChain is the most widely used framework for building LLM-powered applications in Python. It has developed its own vocabulary, and much of that vocabulary has become standard across the broader LLM engineering community. If you work on a team building RAG systems, agents, or LLM pipelines, you will encounter terms like LCEL, retrievers, chains, and memory in daily discussions. Understanding how engineers use these terms precisely will help you contribute to code reviews, architecture documents, and technical discussions with confidence.

LCEL: LangChain Expression Language

LCEL (pronounced as individual letters: L-C-E-L) is LangChain’s declarative way of composing pipelines using the pipe operator. Engineers describe it with phrases like:

  • “We compose the chain using LCEL” — build a pipeline by chaining components with the | operator
  • “The chain is lazy” — LCEL chains do not execute until you call .invoke(), .stream(), or .batch()
  • “We pipe the retriever into the prompt, then into the model” — describing the data flow through the pipeline
  • “LCEL gives us streaming and async support out of the box” — a common stated advantage

The pipe metaphor is strong here. Engineers say “we pipe through” when describing data flowing from one component to the next. A typical description: “We pipe the question through the retriever, combine the retrieved documents with the question in a prompt template, pipe that into the model, and pipe the output through a string parser.”

The word runnable is important in LCEL. Every component in an LCEL chain implements the Runnable interface, which means it can be invoked, streamed, and batched uniformly.

RAG Pipelines

RAG (Retrieval Augmented Generation) is one of the most common use cases for LangChain. The vocabulary is extensive and important:

  • retriever — a component that fetches relevant documents given a query; “we use a vector store retriever backed by Chroma”
  • vector store — a database that stores document embeddings for semantic search
  • embed — convert text into a numerical vector; “we embed the documents before indexing them”
  • chunk — a segment of a larger document; “we split the PDF into 512-token chunks with 50-token overlap”
  • overlap — how much the end of one chunk and the start of the next chunk share; reduces information loss at boundaries
  • rerank — a second-stage sorting of retrieved documents by relevance; “we rerank the top-20 results to get the best 5”
  • context window stuffing — fitting retrieved documents into the LLM’s context; “we need to truncate chunks to avoid context window overflow”

Engineers say: “Our RAG pipeline embeds the user’s question, retrieves the top-5 most similar chunks from the vector store, stuffs them into the prompt as context, and asks the model to answer based only on the provided context.”

Memory and State

LangChain provides memory components for maintaining conversation history. The vocabulary:

  • conversation memory — stores past exchanges to give the model context from earlier turns
  • “We trim the memory to the last N exchanges” — avoid exceeding the context window
  • “The memory is injected into the prompt” — conversation history is formatted and added to the system or human message
  • stateless — a chain with no memory, treating every call independently; “this Q&A chain is stateless — each question is answered independently”

Agents and Tools

LangChain agents use LLMs to decide which tools to call. The vocabulary:

  • agent — an LLM that observes context, reasons about it, and selects tools to call
  • tool — a function that an agent can invoke, with a name and description
  • agent executor — the runtime loop that calls the agent and executes its chosen tools
  • “The agent reasons about which tool to use” — describes the ReAct pattern
  • “We give the agent a scratchpad” — intermediate reasoning steps visible during agent execution
  • final answer — the agent’s response after completing its reasoning and tool calls

Key Vocabulary

TermDefinition
LCELLangChain Expression Language, a declarative syntax for composing pipelines
RunnableThe base interface that all LCEL components implement
retrieverA component that fetches relevant documents for a given query
embedConvert text to a numerical vector for semantic similarity search
chunkA segment of a document, produced by a text splitter
overlapShared content between adjacent chunks to preserve context at boundaries
rerankA second-pass relevance sorting of retrieved documents
RAGRetrieval Augmented Generation — grounding LLM responses in retrieved documents
agentAn LLM that selects and calls tools to complete a task
agent executorThe loop that runs an agent and handles tool calls

Practice Tips

  1. Draw your RAG pipeline as a diagram and narrate it in English. Practise saying: “The user’s query is embedded, passed to the retriever, which returns the top-5 chunks from the vector store. The chunks and the original query are formatted into a prompt template, which is sent to the model. The model’s response is parsed and returned to the user.”

  2. Read LangChain’s “How-to guides” in English. These are practical, well-written, and use consistent vocabulary. Notice how they describe chain composition with the pipe operator.

  3. Use “retriever” not “search.” Engineers who know LangChain say “we use a retriever” not “we search the database.” Using the correct abstraction name signals familiarity with the framework.

  4. Practise explaining chunking strategy trade-offs. A common interview and architecture discussion topic: “Larger chunks provide more context but may include irrelevant information. Smaller chunks are more precise but may lose context. We use overlap to reduce information loss at boundaries.”

Conclusion

LangChain’s vocabulary — LCEL, retriever, chunk, embed, RAG, agent — has become standard across the LLM engineering community. Mastering these terms helps you read documentation, understand code reviews, and write clearer architecture documents. As LLM applications move from prototypes to production, precise communication about pipeline components becomes increasingly important for team alignment and effective debugging.