Writing LLVM Pass Documentation: Vocabulary and Patterns for Compiler Engineers

A guide to writing clear LLVM pass documentation — pass description vocabulary, intent statements, transforms vs analyses, requirements, and documentation patterns for compiler engineers.

Why LLVM Pass Documentation Is Hard to Write

LLVM pass documentation has a reputation for being terse to the point of incomprehensibility, or verbose in the wrong places. Engineers writing a new optimisation pass often document the implementation thoroughly and the intent hardly at all. Readers of the documentation — who may be trying to understand whether the pass is safe to run on their IR, or in what order to schedule it — are left to reverse-engineer the intent from the implementation.

Good LLVM pass documentation answers four questions before the reader reaches the implementation:

  1. What does this pass do? (A one-sentence description of the transformation or analysis)
  2. When should it run? (Prerequisites and pass ordering requirements)
  3. What does it preserve? (Which analyses remain valid after the pass)
  4. Are there any important limitations or assumptions? (Edge cases and known issues)
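
One way to keep the four questions in view while drafting is to treat them as fields that must all be filled in. The sketch below is a minimal, standalone C++ illustration. `PassDoc` and `missingSections` are hypothetical names invented for this example and are not part of LLVM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record holding the four parts of a pass description.
struct PassDoc {
  std::string Summary;      // 1. What does this pass do?
  std::string Requirements; // 2. When should it run?
  std::string Preservation; // 3. What does it preserve?
  std::string Limitations;  // 4. Limitations and assumptions
};

// Returns the names of the sections that are still empty, in question order.
std::vector<std::string> missingSections(const PassDoc &Doc) {
  std::vector<std::string> Missing;
  if (Doc.Summary.empty())
    Missing.push_back("summary");
  if (Doc.Requirements.empty())
    Missing.push_back("requirements");
  if (Doc.Preservation.empty())
    Missing.push_back("preservation");
  if (Doc.Limitations.empty())
    Missing.push_back("limitations");
  return Missing;
}
```

A draft that fills in only the summary and the requirements would be reported as missing "preservation" and "limitations", which are the two sections most often left out.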

The Vocabulary of Pass Intent

Describing Transformations

A transform pass modifies the IR. Documentation for a transform pass uses this vocabulary:

| Verb | Usage |
| --- | --- |
| transforms | The pass changes the structure of the IR in a significant way |
| replaces | One construct is substituted for another |
| eliminates | A construct is removed (dead code elimination, redundancy elimination) |
| hoists | A computation is moved to an earlier point (earlier in the function, or out of a loop) |
| sinks | A computation is moved to a later point |
| folds | Constant expressions are evaluated at compile time |
| inlines | A call site is replaced with the callee’s body |
| canonicalises | The IR is brought into a standard normalised form |
| lowers | A high-level construct is replaced with a lower-level equivalent |
| decomposes | A complex instruction or pattern is split into simpler parts |
| merges | Multiple constructs are combined into a single one |
| vectorises | Scalar operations are transformed into vector operations |

Example intent statement using this vocabulary: “This pass hoists loop-invariant loads out of inner loops and into the loop preheader, eliminating redundant memory accesses in cases where the loaded address and the loaded value are both provably unchanged across loop iterations.”

Describing Analyses

An analysis pass computes information without modifying the IR. Documentation for an analysis pass uses this vocabulary:

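| Verb | Usage |
| --- | --- |
| computes | The pass calculates a property of the IR |
| determines | The pass resolves a question about the IR |
| collects | The pass gathers a set of facts |
| identifies | The pass finds instances of a pattern |
| annotates | The pass attaches derived facts to IR elements as metadata; because writing metadata changes the IR, document such a pass as a transform |
| approximates | The pass produces a conservative estimate of a property |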

Example intent statement: “This analysis computes the alias sets for all pointer-valued instructions in a function, producing a conservative approximation of the memory access relationships that subsequent transform passes can query.”
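
The two vocabulary tables can double as a quick check on a draft intent statement: the first vocabulary verb in the sentence should match the kind of pass being documented. The following is a minimal sketch in standalone C++. `PassKind` and `classifyIntent` are hypothetical helpers invented for this example, not LLVM APIs, and the word matching is deliberately naive.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

enum class PassKind { Transform, Analysis, Unknown };

// Classifies an intent statement by the first vocabulary verb it contains.
// Words are runs of ASCII letters, compared in lower case.
PassKind classifyIntent(const std::string &Statement) {
  static const std::set<std::string> TransformVerbs = {
      "transforms", "replaces", "eliminates",    "hoists",
      "sinks",      "folds",    "inlines",       "canonicalises",
      "lowers",     "decomposes", "merges",      "vectorises"};
  static const std::set<std::string> AnalysisVerbs = {
      "computes",   "determines", "collects",
      "identifies", "annotates",  "approximates"};

  std::string Word;
  // The appended space ensures the final word is checked inside the loop.
  for (char C : Statement + " ") {
    if (std::isalpha(static_cast<unsigned char>(C))) {
      Word += static_cast<char>(std::tolower(static_cast<unsigned char>(C)));
      continue;
    }
    if (TransformVerbs.count(Word))
      return PassKind::Transform;
    if (AnalysisVerbs.count(Word))
      return PassKind::Analysis;
    Word.clear();
  }
  return PassKind::Unknown;
}
```

A statement that comes back as `Unknown` is not necessarily wrong, but it is a prompt to ask whether a more specific verb from the tables would state the intent more clearly.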


Writing Pass Descriptions in LLVM Style

LLVM pass descriptions in source code headers and documentation follow a consistent style. Studying existing LLVM passes is the best way to calibrate your own writing.

The One-Sentence Summary

This appears in the pass registry, the --help output, and the top of the class documentation. It must be:

  • A complete sentence
  • Accurate and specific
  • Free of implementation detail

| Weak summary | Strong summary |
| --- | --- |
| “Does mem2reg stuff.” | “Promotes memory references to register references, eliminating alloca/load/store patterns that are amenable to SSA construction.” |
| “Optimises loops.” | “Performs loop-invariant code motion, moving computations whose operands do not change across loop iterations into the loop preheader.” |

Prerequisites and Requirements

Document prerequisites using “requires” and “assumes”:

  • “This pass requires that promotable stack allocations have already been promoted to SSA registers. Run the mem2reg pass or equivalent before scheduling this pass.”
  • “This pass assumes that function arguments do not alias any global variables. If this assumption may be violated, disable the pass for affected functions using the ‘noalias-args’ attribute.” (The attribute name is illustrative; name whatever opt-out mechanism the pass actually provides.)

Preservation Statements

After a transform pass runs, some analyses remain valid and some are invalidated. Document this using “preserves” and “invalidates”:

  • “This pass preserves the dominator tree, as it does not modify the control flow graph.”
  • “This pass invalidates alias analysis results, as it may introduce new pointer-producing instructions.”
  • “This pass does not modify the IR if no eligible patterns are found; in that case, all analyses are preserved.”
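
Preservation statements are easiest to keep honest when they mirror what the pass reports to the pass manager. In LLVM's new pass manager that report is the `PreservedAnalyses` value returned from `run`, built with calls such as `PreservedAnalyses::all()`, `PreservedAnalyses::none()`, and `preserve<...>()`. The sketch below does not depend on LLVM: `PreservedSet` and `runExamplePass` are toy stand-ins invented for this example, and analyses are tracked by name only. It shows how the three statements above could map onto the pass's return paths.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for llvm::PreservedAnalyses (hypothetical, names only).
struct PreservedSet {
  bool All = false;
  std::set<std::string> Names;

  static PreservedSet all() {
    PreservedSet P;
    P.All = true;
    return P;
  }
  static PreservedSet none() { return PreservedSet(); }

  void preserve(const std::string &Name) { Names.insert(Name); }
  bool isPreserved(const std::string &Name) const {
    return All || Names.count(Name) != 0;
  }
};

// Hypothetical pass body. Each return path corresponds to one documented
// preservation statement.
PreservedSet runExamplePass(bool FoundEligiblePattern) {
  // "This pass does not modify the IR if no eligible patterns are found;
  //  in that case, all analyses are preserved."
  if (!FoundEligiblePattern)
    return PreservedSet::all();

  // "This pass preserves the dominator tree, as it does not modify the
  //  control flow graph."
  PreservedSet P = PreservedSet::none();
  P.preserve("DominatorTree");

  // Alias analysis results are not marked as preserved, which matches the
  // documented "invalidates alias analysis results" statement.
  return P;
}
```

If a return path exists in the code with no matching sentence in the documentation, or the reverse, one of the two is out of date.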

Documenting Limitations and Edge Cases

The limitations section is one of the most important parts of pass documentation and one of the most often neglected. Common patterns:

| Situation | Documentation phrase |
| --- | --- |
| The pass is conservative | “This pass uses a conservative alias analysis and may fail to eliminate some provably safe patterns.” |
| The pass does not handle a case | “This pass does not handle indirect calls. Function pointer calls are left unchanged.” |
| A known interaction with another pass | “Running this pass after [X] may produce suboptimal results; schedule it before [X] for best effect.” |
| A performance cliff | “The analysis has quadratic worst-case complexity in the number of pointer-producing instructions. For functions with more than 10,000 instructions, consider using the interprocedural alias analysis instead.” |

Documentation Patterns From the LLVM Codebase

Study these existing pass descriptions from the LLVM source as models:

  • InstCombine: Combines instructions into more efficient forms; the description carefully distinguishes what it canonicalises versus what it optimises.
  • LICM (Loop Invariant Code Motion): Documents the specific conditions under which a computation qualifies for hoisting.
  • GVN (Global Value Numbering): Documents the relationship between value numbering and load elimination.

Each of these passes has a clear intent statement, explicit prerequisites, and documented limitations. Aim for the same.


Example LLVM Pass Documentation Sentences

  1. “This pass transforms switch instructions with contiguous integer case ranges into lookup tables, replacing an O(n) branch sequence with an O(1) memory access.”
  2. “The pass requires LoopAnalysis and ScalarEvolution to be available; it will not run on loops for which ScalarEvolution cannot compute a trip count.”
  3. “Induction variable simplification canonicalises all loop induction variables to start at zero and increment by one, simplifying subsequent vectorisation and loop unrolling passes.”
  4. “This analysis conservatively marks a pointer as ‘may alias’ when its alias relationship cannot be determined statically; pass authors who need a more precise result should request the AliasAnalysis with per-call-site context.”
  5. “This pass invalidates the MemorySSA analysis on any function it modifies; downstream passes that depend on MemorySSA must request a fresh analysis after this pass runs.”
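
Sentence patterns like the ones above are usually assembled into a single header comment. The sketch below shows one possible layout, loosely following the banner style used at the top of LLVM source files. The pass `ExampleLoadHoistPass` is hypothetical and every claim in its comment is invented for illustration. The struct is a plain standalone type, not a real LLVM pass class.

```cpp
//===- ExampleLoadHoist.h - Hoist loop-invariant loads ---------*- C++ -*-===//
//
// NOTE: hypothetical pass, invented for illustration; not part of LLVM.
//
//===----------------------------------------------------------------------===//
///
/// \file
/// This pass hoists loop-invariant loads out of inner loops and into the
/// loop preheader, eliminating redundant memory accesses.
///
/// Requires: LoopAnalysis and ScalarEvolution. The pass skips any loop for
/// which ScalarEvolution cannot compute a trip count.
///
/// Preserves: all analyses if no eligible patterns are found. Otherwise the
/// pass invalidates MemorySSA on any function it modifies.
///
/// Limitations: the pass uses a conservative alias analysis and may fail to
/// hoist some loads that are provably safe to move.
///
//===----------------------------------------------------------------------===//

#include <cassert>
#include <string>

// Standalone placeholder so the example compiles without LLVM headers.
struct ExampleLoadHoistPass {
  // Short name, as it might appear in a pass pipeline description.
  static std::string name() { return "example-load-hoist"; }

  // One-sentence summary: complete, specific, free of implementation detail.
  static std::string summary() {
    return "Hoists loop-invariant loads into the loop preheader.";
  }
};
```

The comment answers the four questions in the same order every time (intent, requirements, preservation, limitations), so a reader can find each answer without reading the implementation.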

Style and Register Notes

LLVM documentation uses a neutral, impersonal technical register. Avoid first person (“I wrote this pass to…”) and conversational asides (“basically, what this does is…”).

Prefer:

  • Present tense for describing what the pass does: “This pass eliminates…”
  • Conditional for edge cases: “If the trip count is unknown, the pass skips the loop.”
  • Imperative for instructions to users: “Run this pass after mem2reg.”

The documentation will be read by engineers from many backgrounds and English proficiencies. Plain, precise, unambiguous English serves all of them better than idiomatic or colloquial prose.