DSPy Framework Vocabulary: English for Declarative LLM Programming

Learn the essential English vocabulary for DSPy: signatures, modules, optimizers, teleprompters, few-shot compilation, and Chain of Thought in declarative LLM programming.

DSPy reimagines how engineers build LLM-powered systems by replacing hand-crafted prompts with a programming model that can optimise itself. Instead of manually writing and tweaking prompt strings, you define the behaviour you want — and DSPy finds the prompts and few-shot examples that achieve it.

If you are learning DSPy or joining a team that uses it, the vocabulary can be unfamiliar at first. This guide explains the key terms so you can read the documentation, participate in code reviews, and discuss DSPy architecture confidently.


Key Vocabulary

Signature

A signature in DSPy is a declarative specification of what an LLM call should do: its input fields, output fields, and a natural-language description of the task. Rather than writing a raw prompt, you define a signature like "question -> answer" and DSPy handles the prompt construction.

“Instead of maintaining a fragile prompt string, we defined a DSPy signature — if the underlying model changes, we just recompile rather than rewrite all our prompts.”
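To make the shorthand concrete, here is a minimal sketch in plain Python that splits a signature string into its input and output field names. It only illustrates what a signature declares and is not DSPy's own parser; parse_signature is a hypothetical helper. In DSPy itself you would hand the same string to a module, for example dspy.Predict("question -> answer").

```python
def parse_signature(spec: str) -> dict:
    """Split an 'a, b -> c' style string into input and output field names."""
    if "->" not in spec:
        raise ValueError("expected 'inputs -> outputs'")
    left, right = spec.split("->", 1)
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    return {"inputs": inputs, "outputs": outputs}


fields = parse_signature("context, question -> answer")
print(fields)
```

The point of the sketch is that the string names fields and nothing else: the wording of the prompt is left to the framework.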

Module

A module is the building block of a DSPy programme. Each module wraps a call to a language model (or a composition of calls) and exposes a clean Python interface. Modules can be composed together, just like layers in a neural network, to build complex pipelines.

“Our pipeline has three modules in sequence: a retrieval module, a reasoning module, and a formatting module. Each one can be optimised independently.”
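The composition idea can be sketched without any framework. The example below is an assumption-laden illustration, not DSPy code: Module, Retrieve, Reason, Format, and fake_lm are hypothetical stand-ins. It does follow the DSPy convention that a module defines a forward method and is invoked by calling the object.

```python
from typing import Callable, List


class Module:
    """Tiny stand-in for a composable module: calling it runs forward()."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)


class Retrieve(Module):
    def __init__(self, corpus: List[str]):
        self.corpus = corpus

    def forward(self, question: str) -> List[str]:
        # Naive keyword overlap stands in for a real retriever.
        words = set(question.lower().split())
        return [doc for doc in self.corpus if words & set(doc.lower().split())]


class Reason(Module):
    def __init__(self, lm: Callable[[str], str]):
        self.lm = lm

    def forward(self, question: str, passages: List[str]) -> str:
        prompt = "Context: " + " ".join(passages) + "\nQuestion: " + question
        return self.lm(prompt)


class Format(Module):
    def forward(self, text: str) -> str:
        return f"Answer: {text}"


class Pipeline(Module):
    """Runs the three sub-modules in sequence."""

    def __init__(self, corpus: List[str], lm: Callable[[str], str]):
        self.retrieve = Retrieve(corpus)
        self.reason = Reason(lm)
        self.format = Format()

    def forward(self, question: str) -> str:
        passages = self.retrieve(question=question)
        draft = self.reason(question=question, passages=passages)
        return self.format(text=draft)


def fake_lm(prompt: str) -> str:
    # Hypothetical stand-in for a language model call.
    return "Paris" if "Paris" in prompt else "unknown"


pipeline = Pipeline(["Paris is the capital of France."], fake_lm)
print(pipeline(question="What is the capital of France?"))
```

Because each stage is its own object, any one of them can be swapped or tuned without touching the others.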

Optimizer (Teleprompter)

An optimizer (called a teleprompter in early DSPy versions) is an algorithm that automatically improves a DSPy programme. It searches over possible prompts, instructions, and few-shot demonstrations to maximise a metric you define. Examples include BootstrapFewShot, COPRO, and MIPRO (which was earlier released under the name BayesianSignatureOptimizer).

“We ran the BootstrapFewShot optimizer overnight and it found a set of demonstrations that improved our answer accuracy from 61% to 74% — without us writing a single example by hand.”
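At its core, an optimizer is a search loop: propose candidates, score each one with the metric, keep the best. The sketch below shows that loop for candidate instructions only. It is a toy under stated assumptions and not how any DSPy optimizer is implemented; fake_lm, optimize, and the arithmetic task are all hypothetical.

```python
def exact_match(expected: str, predicted: str) -> float:
    return 1.0 if expected.strip().lower() == predicted.strip().lower() else 0.0


def fake_lm(instruction: str, question: str) -> str:
    # Hypothetical stand-in for a model: only a precise instruction
    # makes it reply with a bare number.
    a, b = (int(x) for x in question.split("+"))
    if "only the number" in instruction:
        return str(a + b)
    return f"The answer is {a + b}."


def optimize(candidates, devset, metric):
    """Return the candidate instruction with the highest mean metric score."""

    def mean_score(instruction):
        scores = [metric(gold, fake_lm(instruction, q)) for q, gold in devset]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)


devset = [("1+2", "3"), ("10+5", "15")]
candidates = [
    "Add the numbers.",
    "Add the numbers and reply with only the number.",
]
best = optimize(candidates, devset, exact_match)
print(best)
```

Real optimizers search a much larger space (instructions and demonstrations together) and propose candidates more cleverly, but the score-and-select structure is the same.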

Compiled Program

A compiled program is a DSPy programme after it has been through the optimizer. The optimizer has filled in the concrete prompts, instructions, and demonstrations that the programme will use at inference time. Compilation is analogous to training in machine learning — you run it once and then deploy the compiled artefact.

“We store the compiled programme as a JSON file and load it at startup — recompilation only happens when the metric drops or we change the pipeline architecture.”
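The store-and-load pattern in that sentence can be sketched with the standard library alone. The state layout below (an instructions string plus a list of demos) is an assumption chosen for illustration and is not DSPy's actual file format. DSPy modules provide their own save and load methods for this purpose.

```python
import json
import os
import tempfile

# Hypothetical compiled state: the instructions and demonstrations
# that an optimizer selected.
compiled_state = {
    "instructions": "Answer the question in one short sentence.",
    "demos": [
        {"question": "What is 2 + 2?", "answer": "4"},
        {"question": "What colour is the sky?", "answer": "Blue"},
    ],
}


def save_state(state: dict, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)


def load_state(path: str) -> dict:
    with open(path, encoding="utf-8") as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "compiled_program.json")
save_state(compiled_state, path)  # done once, after compilation
restored = load_state(path)       # done at every startup
print(restored == compiled_state)
```

Keeping the compiled state in a plain file means it can be versioned and reviewed like any other build artefact.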

Few-Shot Compilation

Few-shot compilation is the process by which DSPy’s optimizer selects and bootstraps demonstration examples (few-shot examples) to include in the prompts of each module. Rather than manually curating examples, the optimizer generates candidate demonstrations using the pipeline itself and evaluates which ones improve the metric.

“The optimizer bootstrapped 50 candidate demonstrations from our training set and selected the 8 that most improved validation accuracy — that’s few-shot compilation in action.”

Chain of Thought

Chain of Thought (CoT) is a reasoning strategy where the LLM is prompted to write out intermediate reasoning steps before giving its final answer. In DSPy, ChainOfThought is a built-in module that adds an extra reasoning output field to any signature (named rationale in older releases and reasoning in more recent ones), so the model is prompted to reason step by step before it answers.

“Switching from Predict to ChainOfThought for our multi-step reasoning module improved accuracy by 12 percentage points — the model errors were happening because it was jumping to conclusions.”
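The effect of adding a reasoning field can be shown with a small prompt-building sketch. This is not DSPy's prompt format: build_prompt and with_reasoning are hypothetical helpers, and the field name reasoning is an assumption (the name DSPy uses has differed between releases).

```python
from typing import Dict, List


def build_prompt(inputs: Dict[str, str], output_fields: List[str]) -> str:
    """Render the input values, then list the fields the model should fill in."""
    lines = [f"{name}: {value}" for name, value in inputs.items()]
    lines += [f"{name}:" for name in output_fields]
    return "\n".join(lines)


def with_reasoning(output_fields: List[str]) -> List[str]:
    # Put a reasoning field before the original outputs so the model
    # writes its steps first and its answer last.
    return ["reasoning"] + output_fields


inputs = {"question": "A train leaves at 9:00 and arrives at 11:30. How long is the trip?"}
plain = build_prompt(inputs, ["answer"])
cot = build_prompt(inputs, with_reasoning(["answer"]))
print(cot)
```

The extra field is also why Chain of Thought costs more tokens: the model has to generate the reasoning text before it reaches the answer.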

Bootstrapping

Bootstrapping in DSPy refers to generating demonstrations automatically: the optimizer runs the pipeline on example inputs, records what each module received and produced, and keeps the runs whose final output satisfies the metric. It allows you to build a training set without manually annotating every example.

“We only had 20 manually labelled examples, but after bootstrapping we had 400 high-quality demonstrations for the optimizer to work with.”
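The run-and-filter loop behind bootstrapping fits in a few lines. The sketch below is a simplified illustration under stated assumptions, not DSPy's implementation: bootstrap, toy_pipeline, and is_number are hypothetical, and the metric here needs no labels.

```python
def bootstrap(pipeline, inputs, metric, max_demos=4):
    """Run the pipeline on each input and keep the outputs the metric accepts."""
    demos = []
    for question in inputs:
        answer = pipeline(question)
        if metric(question, answer):
            demos.append({"question": question, "answer": answer})
        if len(demos) >= max_demos:
            break
    return demos


def toy_pipeline(question: str) -> str:
    # Hypothetical stand-in for an LLM pipeline: it fails on subtraction.
    a, op, b = question.split()
    return str(int(a) + int(b)) if op == "+" else "not sure"


def is_number(question: str, answer: str) -> bool:
    # Label-free metric: accept only answers that parse as an integer.
    return answer.lstrip("-").isdigit()


demos = bootstrap(toy_pipeline, ["1 + 2", "5 - 3", "4 + 4"], is_number)
print(demos)
```

The failed run on "5 - 3" is simply discarded, which is why the quality of bootstrapped demonstrations depends entirely on how strict the metric is.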

Metric

A metric in DSPy is a function that scores a programme’s output. By convention it takes an example (which holds the inputs and the expected output), the programme’s prediction, and an optional trace argument, and it returns a score such as a number or a boolean. The optimizer uses the metric to judge which prompts and demonstrations are best. Defining a good metric is one of the most important engineering decisions when using DSPy.

“Our metric checks three things: factual accuracy, citation presence, and response length. We spent more time designing the metric than building the pipeline itself.”
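A three-part metric like the one in that sentence might look as follows. The function uses the metric(example, pred, trace=None) shape that DSPy metrics conventionally follow, but the individual checks, the 50-word limit, and the equal weighting are assumptions made for illustration. SimpleNamespace objects stand in for DSPy's example and prediction objects.

```python
import re
from types import SimpleNamespace


def metric(example, pred, trace=None) -> float:
    """Average three checks: correct answer, a citation, and a length limit."""
    accurate = example.answer.lower() in pred.answer.lower()
    cited = re.search(r"\[\d+\]", pred.answer) is not None
    concise = len(pred.answer.split()) <= 50
    return (accurate + cited + concise) / 3


example = SimpleNamespace(question="Who wrote Hamlet?", answer="Shakespeare")
good = SimpleNamespace(answer="Hamlet was written by Shakespeare [1].")
bad = SimpleNamespace(answer="I think it was Marlowe.")
print(metric(example, good), metric(example, bad))
```

Note that the bad answer still earns partial credit for being short. Whether that is acceptable is exactly the kind of design decision that makes metrics hard to get right.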


Useful Phrases

  • “The signature tells DSPy what the module needs to do — the optimizer figures out how to prompt the model to do it.”
  • “We haven’t touched a raw prompt string in months — everything goes through DSPy signatures and the optimizer handles the rest.”
  • “Recompilation took about two hours on our validation set, but now the programme adapts automatically when we swap the underlying model.”
  • “The ChainOfThought module improved performance significantly, but it roughly doubles token usage — we’re evaluating whether the quality gain is worth the cost.”
  • “Our metric is the most critical piece of the whole system — if the metric is wrong, the optimizer will confidently produce the wrong programme.”

Common Mistakes

Confusing “compiling” with traditional code compilation

When engineers first hear that DSPy “compiles” a programme, they sometimes think this means transpiling Python code or generating binary artefacts. In DSPy, compilation is an optimisation process that finds good prompts and demonstrations — not a traditional compile step. The output is a set of prompt strings and examples, not compiled code. Say “we compiled the DSPy programme against our validation set”, not “we built the DSPy binary”.

Using “teleprompter” in current documentation

Early DSPy versions used the term teleprompter for what is now called an optimizer. If you read older tutorials or papers, you will encounter this term. In current DSPy releases (2.x and later), the preferred term is optimizer, although the optimizer classes are still imported from the dspy.teleprompt package. Using “teleprompter” in a modern code review or documentation may cause confusion for readers who only know the newer terminology.

Treating the metric as an afterthought

A common mistake is to build the pipeline first and define a rough metric quickly at the end. In DSPy, the metric is the specification — it determines what “better” means. A vague or poorly calibrated metric produces a compiled programme that scores highly on the metric but fails in practice. Teams that struggle with DSPy often have a metric problem, not a pipeline problem.


DSPy represents a significant shift in how LLM applications are engineered — from prompt artistry to software engineering. The vocabulary above captures the most important concepts in this model: you declare what you want (signatures), compose it into a pipeline (modules), define what “good” means (metrics), and let the optimizer find how to achieve it (compilation). Once these concepts click, the DSPy documentation becomes much easier to navigate.