English for Jaeger Tracing Developers
Learn the English vocabulary for working with Jaeger: spans, traces, and sampling, plus how to explain distributed tracing to a team debugging cross-service latency.
Jaeger conversations happen mostly during incident debugging, when a request crosses several services and no single log tells the whole story, so the vocabulary centers on spans, trace context, and sampling trade-offs.
Key Vocabulary
Span — a single timed operation within a trace, representing one unit of work (a function call, a database query, an HTTP request) with a start time, duration, and associated metadata. “Add a span around this database call specifically — right now the parent span just shows ‘slow,’ and we can’t tell if it’s the query itself or something else inside that block.”
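To make the request in that example sentence concrete, here is a minimal sketch of wrapping one suspect operation in its own child span. It uses a toy in-memory `span` context manager written only for this illustration (it is not the Jaeger or OpenTelemetry API), and `handle_request`, `run_query`, and the sleep durations are hypothetical stand-ins.

```python
import time
from contextlib import contextmanager

# Toy in-memory recorder for illustration only; real code would use a tracing SDK.
finished_spans = []
_stack = []  # names of the currently open spans, innermost last


@contextmanager
def span(name):
    """Time the enclosed block and record it together with its parent's name."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        _stack.pop()
        finished_spans.append(
            {"name": name, "parent": parent, "duration_ms": duration_ms}
        )


def run_query():
    time.sleep(0.03)  # hypothetical slow database call


def handle_request():
    with span("handle_request"):
        time.sleep(0.005)        # other work inside the same block
        with span("db.query"):   # the new, narrower span
            run_query()


handle_request()
for s in finished_spans:
    print(f'{s["name"]} (parent={s["parent"]}): {s["duration_ms"]:.1f} ms')
```

With the child span in place, the two durations can be compared directly: if `db.query` accounts for most of `handle_request`, the query is the problem; if not, the time is going somewhere else in the block.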
Trace — the full collection of spans representing one request’s journey across all the services it touched, letting you see end-to-end where time was actually spent. “Pull up the trace for this slow request — instead of guessing which service is responsible, we can see exactly which span ate the four hundred milliseconds.”
Trace context propagation — passing a trace’s identifying information (trace ID, span ID) between services via headers, so spans created in different services are linked into one coherent trace. “This service isn’t propagating trace context to the downstream call — that’s why its spans show up as a disconnected, separate trace instead of part of the original one.”
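As a rough illustration of what "passing identifying information via headers" means, the sketch below builds and parses a W3C Trace Context `traceparent` header, one of the formats a Jaeger setup may use (Jaeger's own client libraries historically used an `uber-trace-id` header instead). The helper names `inject`, `extract`, and `start_server_span` are chosen for this example and are not a library API.

```python
import re
import secrets

# traceparent layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(headers, trace_id, span_id, sampled=True):
    """Caller side: copy the current trace identity into the outgoing headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers


def extract(headers):
    """Callee side: return (trace_id, parent_span_id), or None if missing or malformed."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    return match.group(1), match.group(2)


def start_server_span(headers):
    """Continue the caller's trace if context arrived; otherwise start a new one."""
    context = extract(headers)
    if context is None:
        # No usable context: this span becomes the root of a separate trace.
        return {"trace_id": secrets.token_hex(16), "parent_id": None,
                "span_id": secrets.token_hex(8)}
    trace_id, parent_id = context
    return {"trace_id": trace_id, "parent_id": parent_id,
            "span_id": secrets.token_hex(8)}


upstream = {"trace_id": secrets.token_hex(16), "span_id": secrets.token_hex(8)}

linked = start_server_span(inject({}, upstream["trace_id"], upstream["span_id"]))
orphan = start_server_span({})  # the caller forgot to forward the header

print(linked["trace_id"] == upstream["trace_id"])  # True: same trace
print(orphan["trace_id"] == upstream["trace_id"])  # False: split into a separate trace
```

The `orphan` case is the situation described in the example sentence: the downstream service still records spans, but under a fresh trace ID, so they appear in Jaeger as a separate trace.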
Sampling — the practice of only recording a subset of traces (rather than every single request) to control the storage and performance overhead of tracing at scale. “We’re sampling at one percent, which is why this specific slow request from the bug report isn’t in Jaeger — bump the rate temporarily, or add tail-based sampling that always keeps errors.”
Critical path — the sequence of spans within a trace that directly determines total request latency, as opposed to spans that ran in parallel and didn’t add to the overall time. “These two spans ran in parallel, so only the slower one is on the critical path. Speeding up the faster one won’t change the total, because the next span can’t start until both have finished.”
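The idea of a critical path can be sketched in a few lines. The example below is a simplified, single-level model (one parent span and its direct children), not Jaeger's own implementation; the span names and timings are hypothetical. It walks backwards from the end of the request, each time picking the child span that finished last.

```python
from dataclasses import dataclass


@dataclass
class Span:
    name: str
    start_ms: int
    end_ms: int


def critical_path(children, parent_end_ms):
    """Walk backwards from the parent's end, repeatedly picking the child
    that finished last at or before the current cursor."""
    path = []
    cursor = parent_end_ms
    remaining = list(children)
    while True:
        candidates = [s for s in remaining if s.end_ms <= cursor]
        if not candidates:
            break
        last = max(candidates, key=lambda s: s.end_ms)
        path.append(last)
        cursor = last.start_ms  # whatever delayed this span must end by here
        remaining.remove(last)
    return list(reversed(path))


children = [
    Span("auth", 0, 20),
    Span("fetch_profile", 20, 120),  # runs in parallel with fetch_prefs
    Span("fetch_prefs", 20, 60),     # finishes early, so it adds no latency
    Span("render", 120, 150),        # cannot start until both fetches are done
]

path = critical_path(children, parent_end_ms=150)
print([s.name for s in path])  # ['auth', 'fetch_profile', 'render']
```

In this made-up trace, `fetch_prefs` never appears on the path: making it faster would not shorten the request, while any saving in `fetch_profile` would.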
Common Phrases
- “Can you add a span around that specific call so we can isolate where the time is actually going?”
- “Is this a single trace, or did context propagation break somewhere and split it into two?”
- “Are we sampling this at a high enough rate to actually catch the request in the bug report?”
- “Is this span on the critical path, or did it run in parallel with something else?”
Example Sentences
Debugging a slow request: “The trace shows this one span taking three hundred milliseconds by itself. That’s the actual bottleneck; everything else in the request is comparatively fast.”
Diagnosing a broken trace: “This service’s spans aren’t showing up under the parent trace — check whether it’s actually propagating the trace context header to the downstream call.”
Discussing sampling strategy: “Fixed-rate sampling at one percent keeps only about one in a hundred error traces, because the decision is made before we know how the request ends. Let’s switch to tail-based sampling so we always keep traces that ended in a failure.”
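The sampling trade-off in that last sentence can be shown with a small simulation. This is a sketch under stated assumptions: the 2% error rate, the 80 ms mean latency, the 500 ms "slow" threshold, and the functions `head_sample` and `tail_sample` are all invented for illustration, and the modulo check is a simplification rather than how Jaeger's samplers are implemented.

```python
import random


def head_sample(trace_id, rate):
    """Head-based: decide from the trace ID alone, before the outcome is known."""
    return (trace_id % 10_000) < rate * 10_000


def tail_sample(trace, rate, slow_ms=500):
    """Tail-based: decide after the trace finishes; always keep errors and slow requests."""
    if trace["error"] or trace["duration_ms"] >= slow_ms:
        return True
    return head_sample(trace["trace_id"], rate)


rng = random.Random(42)  # fixed seed so the simulation is repeatable
traces = [
    {
        "trace_id": rng.getrandbits(64),
        "error": rng.random() < 0.02,            # assumption: 2% of requests fail
        "duration_ms": rng.expovariate(1 / 80),  # assumption: 80 ms mean latency
    }
    for _ in range(100_000)
]

errors = [t for t in traces if t["error"]]
head_kept = sum(head_sample(t["trace_id"], 0.01) for t in errors)
tail_kept = sum(tail_sample(t, 0.01) for t in errors)

print(f"error traces:             {len(errors)}")
print(f"kept by 1% head sampling: {head_kept}")
print(f"kept by tail sampling:    {tail_kept}")
```

The head-based sampler keeps roughly one percent of the failed requests, the same share it keeps of everything else, while the tail-based rule keeps every one of them. The cost is that a tail-based sampler has to buffer each trace until it completes before deciding.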
Professional Tips
- Encourage developers to add spans around specific suspect operations, not just at the service boundary — coarse spans hide exactly the detail an incident needs.
- Diagnose disconnected or partial traces by checking trace context propagation first; a missing header at a service boundary is one of the most common causes.
- Push for smarter sampling strategies (like tail-based sampling that always keeps errors and slow requests) over naive fixed-rate sampling, which tends to miss the traces you actually care about.
- Teach teams to identify the critical path in a trace before optimizing; a span that ran in parallel with a slower one is off the critical path, so speeding it up won’t reduce total latency.
Practice Exercise
- Explain to a teammate why a coarse span at the service boundary hides the detail needed to debug an incident.
- Describe what broken trace context propagation looks like in Jaeger and how to spot it.
- Write a sentence proposing tail-based sampling instead of a naive fixed sampling rate.