Practice the vocabulary of distributed tracing: spans, traces, context propagation, and reading trace waterfalls in Jaeger, Zipkin, or Tempo.
1 / 8
What is a 'trace' in distributed tracing?
A trace represents one request's end-to-end journey. If a user request hits Service A → Service B → Database → Cache → Service C, the trace captures all of these steps with their timing, so you can identify where latency occurs.
2 / 8
What is a 'span' in the context of distributed tracing?
Spans are the building blocks of traces. A trace is a tree of spans: the root span (initial service call) plus child spans (downstream calls). Each span includes name, timing, status, and attributes (HTTP method, DB query, etc.).
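As a rough illustration of the span tree described above, here is a minimal sketch in Python. The `Span` class, its field names, and the example IDs and timings are all hypothetical; real tracing SDKs define their own span types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical span record: name, timing, status, and attributes."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: float
    end_ms: float
    status: str = "OK"
    attributes: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def root_of(spans):
    """Return the single span that has no parent."""
    roots = [s for s in spans if s.parent_id is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root span, found {len(roots)}")
    return roots[0]


def children_of(spans, parent):
    """Return the direct child spans of `parent`."""
    return [s for s in spans if s.parent_id == parent.span_id]


# Example values only: one root span with two downstream calls.
TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"
spans = [
    Span(TRACE_ID, "a1", None, "GET /orders", 0, 120,
         attributes={"http.method": "GET"}),
    Span(TRACE_ID, "b2", "a1", "SELECT orders", 10, 70,
         attributes={"db.system": "postgresql"}),
    Span(TRACE_ID, "c3", "a1", "cache lookup", 70, 80),
]

root = root_of(spans)
for child in children_of(spans, root):
    print(f"{root.name} -> {child.name}: {child.duration_ms} ms")
```

Every span carries the same trace ID, and the parent ID is what turns a flat list of spans into a tree.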
3 / 8
What is 'context propagation' in distributed tracing?
Context propagation passes trace identifiers (trace ID and parent span ID) from one service to the next along with the request. Without it, each service's spans are isolated and you cannot link them into a trace. Tracing libraries typically inject trace context into HTTP headers (W3C Trace Context standard: traceparent, tracestate) and extract it on the receiving end.
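To make the inject/extract step concrete, here is a minimal sketch that builds and parses a `traceparent` header by hand. It assumes the version `00` layout from the W3C specification (`version-traceid-parentid-flags`, lowercase hex); the `inject` and `extract` helper names are hypothetical, and in practice a tracing library would do this for you.

```python
import re

# Version 00 layout: 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def inject(headers, trace_id, span_id, sampled=True):
    """Write the caller's trace context into outgoing request headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers):
    """Read trace context from incoming headers; return None if absent or invalid."""
    value = headers.get("traceparent")
    if value is None:
        return None
    match = TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs and version ff are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


# Service A (caller) injects; Service B (callee) extracts.
outgoing = {}
inject(outgoing, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
context = extract(outgoing)
print(outgoing["traceparent"])
print(context)
```

The receiving service would start its own span with the extracted `trace_id` and use `parent_id` as that span's parent, which is what links the two services into one trace.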
4 / 8
In a trace waterfall view, what does a span appearing under another span indicate?
The trace waterfall (a timeline view closely related to a flame graph) shows parent-child relationships. A child span nested under a parent means the parent initiated that work. Horizontal position shows when a span started and width shows how long it ran, so an unusually wide child span points to where the latency is being spent.
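The layout of a waterfall can be approximated in plain text. The sketch below is only an illustration: the span dictionaries, field names, and timings are made up, and real UIs such as Jaeger or Tempo render far richer views.

```python
def render_waterfall(spans, width=40):
    """Render spans as indented text rows; bar offset = start time, bar length = duration."""
    by_parent = {}
    for span in spans:
        by_parent.setdefault(span["parent"], []).append(span)
    root = by_parent[None][0]
    total = root["end"] - root["start"]
    if total <= 0:
        raise ValueError("root span must have a positive duration")

    lines = []

    def walk(span, depth):
        # Integer arithmetic keeps the bar sizes deterministic.
        offset = (span["start"] - root["start"]) * width // total
        length = max(1, (span["end"] - span["start"]) * width // total)
        label = "  " * depth + span["name"]
        lines.append(f"{label:<18}|{' ' * offset}{'#' * length}")
        for child in sorted(by_parent.get(span["id"], []), key=lambda c: c["start"]):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


# Hypothetical trace; times are milliseconds since the request started.
example = [
    {"id": "a", "parent": None, "name": "GET /checkout", "start": 0, "end": 200},
    {"id": "b", "parent": "a", "name": "cart-service", "start": 10, "end": 60},
    {"id": "c", "parent": "a", "name": "payment-db", "start": 60, "end": 190},
]

for line in render_waterfall(example):
    print(line)
```

In this made-up trace the `payment-db` bar is by far the widest child, which is the visual cue the waterfall is meant to give.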
5 / 8
What does 'sampling' mean in distributed tracing?
Capturing 100% of traces is expensive: high-traffic services generate millions of traces. Sampling keeps only a subset. Head-based sampling decides at trace start (e.g., record a random 1%), before the outcome is known; tail-based sampling decides after the trace completes, so it can keep every trace that has errors or high latency, regardless of the baseline sampling rate.
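The difference between the two strategies can be sketched as two small decision functions. This is an illustrative sketch, not any particular collector's implementation: the function names, the span dictionary fields, and the idea of deriving the head decision from the trace ID (so every service reaches the same verdict) are assumptions made for the example.

```python
def head_sample(trace_id: str, rate: float) -> bool:
    """Decide at trace start, using only the trace ID.

    Mapping the low 64 bits of the ID onto [0, 1] means every service
    that sees the same trace ID makes the same decision.
    """
    if rate >= 1.0:
        return True
    bucket = int(trace_id[-16:], 16) / 2**64
    return bucket < rate


def tail_sample(trace_id, spans, latency_threshold_ms, baseline_rate):
    """Decide after the trace has completed, with all spans in hand."""
    has_error = any(s["status"] == "ERROR" for s in spans)
    duration_ms = max(s["end_ms"] for s in spans) - min(s["start_ms"] for s in spans)
    if has_error or duration_ms >= latency_threshold_ms:
        return True  # always keep the interesting traces
    return head_sample(trace_id, baseline_rate)  # otherwise keep a baseline share


# Hypothetical completed traces.
healthy = [{"status": "OK", "start_ms": 0, "end_ms": 40}]
failed = [{"status": "OK", "start_ms": 0, "end_ms": 40},
          {"status": "ERROR", "start_ms": 5, "end_ms": 30}]
slow = [{"status": "OK", "start_ms": 0, "end_ms": 2500}]

trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
for label, spans in [("healthy", healthy), ("failed", failed), ("slow", slow)]:
    kept = tail_sample(trace_id, spans, latency_threshold_ms=1000, baseline_rate=0.0)
    print(label, "kept" if kept else "dropped")
```

The trade-off is that a tail-based decision needs all spans of a trace buffered somewhere until the trace finishes, whereas a head-based decision costs almost nothing but cannot know which traces will turn out to be interesting.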
6 / 8
What does 'critical path' mean when analyzing a distributed trace?
The critical path is the longest chain of dependent spans, the one that determines the total duration of the request. Spans that run in parallel with it and finish earlier do not add to total time, so optimizing them has no effect on end-to-end latency. Identifying and optimizing the spans on the critical path is the key to reducing end-to-end response time.
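One common way to find the critical path is to walk backward from the end of the parent span, repeatedly picking the child that finished last before the current point in time. The sketch below assumes well-ordered timestamps (no clock skew between services) and uses hypothetical span dictionaries; it is an illustration of the idea rather than a production algorithm.

```python
def critical_path(spans):
    """Return the spans on the critical path, ordered by start time."""
    children = {}
    for span in spans:
        children.setdefault(span["parent"], []).append(span)
    root = children[None][0]

    def walk(span):
        path = [span]
        cursor = span["end"]  # walk backward in time from the span's end
        latest_first = sorted(children.get(span["id"], []),
                              key=lambda c: c["end"], reverse=True)
        for child in latest_first:
            if child["end"] <= cursor:
                # This child was the last thing the parent waited on.
                path.extend(walk(child))
                cursor = child["start"]
            # Otherwise the child overlapped a later critical span: skip it.
        return path

    return sorted(walk(root), key=lambda s: s["start"])


# Hypothetical trace: auth, then db and cache in parallel, then render.
trace = [
    {"id": "r", "parent": None, "name": "request", "start": 0, "end": 100},
    {"id": "a", "parent": "r", "name": "auth", "start": 0, "end": 20},
    {"id": "d", "parent": "r", "name": "db", "start": 20, "end": 90},
    {"id": "c", "parent": "r", "name": "cache", "start": 20, "end": 40},
    {"id": "v", "parent": "r", "name": "render", "start": 90, "end": 100},
]

names = [s["name"] for s in critical_path(trace)]
print(names)
```

Here `cache` runs alongside `db` and finishes earlier, so it is left off the path: making it faster would not shorten the request, while shortening `db` would.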
7 / 8
What is an 'exemplar' in the context of metrics and tracing?
Exemplars bridge the gap between metrics and traces. When you see a P99 latency spike in Grafana, an exemplar lets you click through to see the actual slow trace — dramatically reducing the time to diagnose latency regressions.
8 / 8
How would you describe a 'fan-out' pattern in a distributed trace to a colleague?
Fan-out is when one service calls N downstream services in parallel (e.g., aggregating data from 5 microservices). The trace shows this as sibling spans under the same parent that overlap in time. The total duration is max(child durations), not their sum, so the slowest call sets the overall latency: parallel calls are efficient until one of them is slow.
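The max-versus-sum behaviour can be observed with a small simulation. This sketch uses `asyncio.sleep` to stand in for downstream calls; the service names and delays are invented for the example.

```python
import asyncio
import time


async def call_service(name, delay_s):
    """Simulate one downstream call and report how long it took."""
    start = time.perf_counter()
    await asyncio.sleep(delay_s)  # stand-in for a network request
    return name, time.perf_counter() - start


async def fan_out(delays):
    """Call every service concurrently and time the whole fan-out."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(call_service(name, delay) for name, delay in delays.items())
    )
    return dict(results), time.perf_counter() - start


# Hypothetical downstream services; "pricing" is the slow one.
DELAYS = {
    "profile": 0.05,
    "inventory": 0.05,
    "reviews": 0.05,
    "recommendations": 0.05,
    "pricing": 0.10,
}

child_durations, total = asyncio.run(fan_out(DELAYS))
print(f"sum of children: {sum(child_durations.values()):.2f}s")
print(f"slowest child:   {max(child_durations.values()):.2f}s")
print(f"fan-out total:   {total:.2f}s")
```

The fan-out total should land close to the slowest child rather than the sum, which is why a single slow dependency (here `pricing`) dominates the latency of the whole request.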