Practice production debugging vocabulary: distributed tracing, non-intrusive profiling, audit log reconstruction, canary diagnostics, and remote debugging constraints.
1 / 5
An SRE says 'We can't attach a debugger to production.' Why is this typically true?
Interactive debuggers pause execution at breakpoints. Pausing a production process freezes request handling for every user that process serves, so the impact is immediate. Production debugging relies instead on observability tools (traces, logs, metrics) that provide insight without stopping execution.
2 / 5
A senior engineer says 'The distributed trace shows where the request failed.' What does a distributed trace reveal?
A distributed trace follows a request across all the services it touches, recording the timing and outcome of each operation (service calls, database queries, external API calls). This makes it possible to pinpoint which service or operation caused a latency spike or failure in a complex distributed system.
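The idea above can be sketched as a small data model. This is a minimal illustration, not the format of any particular tracing system: the `Span` fields, the service names, and the sample timings are all hypothetical, and the self-time calculation assumes child spans do not overlap.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One timed operation inside a trace (hypothetical, simplified shape)."""
    span_id: str
    parent_id: Optional[str]
    service: str
    operation: str
    start_ms: int
    end_ms: int
    ok: bool = True

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def deepest_failure(spans: List[Span]) -> Optional[Span]:
    """Return a failed span that has no failed child: the likely origin."""
    failed = [s for s in spans if not s.ok]
    parents_of_failures = {s.parent_id for s in failed}
    origins = [s for s in failed if s.span_id not in parents_of_failures]
    return min(origins, key=lambda s: s.start_ms) if origins else None


def self_time_ms(span: Span, spans: List[Span]) -> int:
    """Time spent in the span itself, excluding its direct children.

    Assumes children run one after another (no overlap).
    """
    children = [s for s in spans if s.parent_id == span.span_id]
    return span.duration_ms - sum(c.duration_ms for c in children)


# Made-up trace for one request: the gateway calls cart and payments,
# and payments calls its database and an external card provider.
TRACE = [
    Span("a", None, "gateway", "GET /checkout", 0, 480, ok=False),
    Span("b", "a", "cart", "load_cart", 5, 45),
    Span("c", "a", "payments", "charge", 50, 470, ok=False),
    Span("d", "c", "payments-db", "INSERT charge", 60, 90),
    Span("e", "c", "card-provider", "POST /authorize", 95, 465, ok=False),
]

origin = deepest_failure(TRACE)
slowest = max(TRACE, key=lambda s: self_time_ms(s, TRACE))
print(origin.service, origin.operation)
print(slowest.service, self_time_ms(slowest, TRACE))
```

The failure shows up on three spans because errors propagate upward to the callers. Looking for the failed span with no failed children is one simple way to separate the origin from the services that merely relayed the error.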
3 / 5
Your production tooling includes 'a profiler that runs without pausing the process.' What type of profiler is this?
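Sampling profilers capture stack traces at regular intervals instead of instrumenting every call or halting the process at a breakpoint. Each sample interrupts a thread only briefly, so the overhead is usually low enough for production use. The resulting CPU profile shows where time is being spent (some profilers also sample memory allocations), which helps diagnose performance issues without first having to reproduce them in a development environment.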
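To make the sampling idea concrete, here is a toy sketch. It is not how production profilers are built: it relies on `sys._current_frames()`, a CPython-specific call, and the names `sample_stacks`, `profile`, and `busy_loop` are made up for the illustration. It records only the function at the top of each thread's stack.

```python
import collections
import sys
import threading
import time


def sample_stacks(counts, ignore_thread_id):
    """Record which function each thread is executing right now."""
    for thread_id, frame in sys._current_frames().items():
        if thread_id == ignore_thread_id:
            continue  # do not profile the sampler itself
        counts[frame.f_code.co_name] += 1


def profile(duration_s=0.2, interval_s=0.005):
    """Sample all other threads at a fixed interval and count what was seen."""
    counts = collections.Counter()
    sampler_id = threading.get_ident()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        sample_stacks(counts, sampler_id)
        time.sleep(interval_s)  # the workload keeps running between samples
    return counts


def busy_loop(stop):
    """Stand-in for application work (hypothetical workload)."""
    total = 0
    while not stop["flag"]:
        total += 1
    return total


stop = {"flag": False}
worker = threading.Thread(target=busy_loop, args=(stop,), daemon=True)
worker.start()
counts = profile()
stop["flag"] = True
worker.join()
print(counts.most_common(3))
```

Functions that appear in many samples are where the process spends most of its time. The workload is never stopped at a breakpoint; it is only observed at intervals, which is why the cost stays proportional to the sampling rate and not to the number of calls.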
4 / 5
During an incident, your team says 'The audit log reconstructs the event sequence.' What makes an audit log useful for production debugging?
Audit logs are timestamped records of significant events, typically who did what to which resource. During debugging, they let you reconstruct what happened: which user action triggered what, what state the system was in at a given time (as far as the logged changes show), and what sequence of events preceded the failure. This is crucial when you can't reproduce the issue locally.
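Reconstructing the sequence usually comes down to ordering events by timestamp and narrowing to the window before the failure. The sketch below assumes a hypothetical record shape (`ts`, `actor`, `action`, `target`) and made-up events; real audit logs vary by system.

```python
from datetime import datetime, timedelta

# Made-up audit records, deliberately out of order, as they might arrive
# when collected from several services.
EVENTS = [
    {"ts": "2024-05-01T10:00:07+00:00", "actor": "alice", "action": "user.login", "target": "session"},
    {"ts": "2024-05-01T09:58:30+00:00", "actor": "deploy-bot", "action": "config.update", "target": "rate-limit"},
    {"ts": "2024-05-01T10:00:41+00:00", "actor": "alice", "action": "order.submit", "target": "order-1"},
    {"ts": "2024-05-01T09:40:00+00:00", "actor": "bob", "action": "user.login", "target": "session"},
    {"ts": "2024-05-01T10:00:42+00:00", "actor": "payments", "action": "charge.failed", "target": "order-1"},
]


def timeline_before(events, failure_ts, window):
    """Return events inside `window` leading up to the failure, oldest first."""
    failure = datetime.fromisoformat(failure_ts)
    start = failure - window
    selected = [
        e for e in events
        if start <= datetime.fromisoformat(e["ts"]) <= failure
    ]
    return sorted(selected, key=lambda e: datetime.fromisoformat(e["ts"]))


timeline = timeline_before(
    EVENTS, "2024-05-01T10:00:42+00:00", timedelta(minutes=5)
)
for event in timeline:
    print(event["ts"], event["actor"], event["action"], event["target"])
```

In this made-up example the ordered timeline puts a configuration change shortly before the failed charge, which is the kind of lead an audit log can surface even when the failure cannot be reproduced.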
5 / 5
A post-incident report says 'The canary is capturing diagnostic data.' What is a canary in production debugging?
A diagnostic canary routes a small fraction of production traffic to a specially instrumented build, one with verbose logging, extra tracing, or profiling enabled. This captures real production data for debugging while confining the performance cost and blast radius to a small subset of users.
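One common way to pick that small fraction is to hash a stable request attribute into a bucket, so the same user always lands on the same build. The sketch below is an assumption about how such routing could look, not a description of any specific load balancer; `routes_to_canary`, `pick_backend`, and the 5 percent figure are hypothetical.

```python
import hashlib


def routes_to_canary(request_key: str, canary_percent: float) -> bool:
    """Deterministically send about `canary_percent` of keys to the canary.

    Hashing keeps the decision stable: a given key always gets the same
    answer, so one user's requests all hit the same build.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < canary_percent * 100


def pick_backend(user_id: str, canary_percent: float = 5.0) -> str:
    """Return the (hypothetical) backend pool name for this user."""
    if routes_to_canary(user_id, canary_percent):
        return "canary-instrumented"
    return "stable"


sample = [pick_backend(f"user-{i}") for i in range(10_000)]
print(sample.count("canary-instrumented") / len(sample))
```

Setting the percentage to zero turns the canary off without a redeploy of the routing logic, which is useful once enough diagnostic data has been captured.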