Root Cause Analysis
5 Whys, proximate vs. root cause, contributing factors, and blameless causation language
Root cause vocabulary
- Proximate cause — the immediate trigger: "timeout was misconfigured"
- Contributing factor — a condition that amplified the impact
- Root cause — the systemic gap: "no version control for infrastructure config"
- "Triggered by", "as a result of", "contributing to" — causation framing
- 5 Whys — ask "why?" iteratively until you reach a systemic process gap
In root cause analysis, what is the difference between a proximate cause and a root cause?
Proximate = immediate trigger; Root = underlying systemic failure. Example:
- Proximate cause: "A misconfigured load balancer timeout caused connection queuing"
- Contributing factor: "The staging environment did not mirror the production timeout settings"
- Root cause: "Infrastructure configuration is not version-controlled, allowing environment drift to go undetected"
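The three levels above can also be kept apart in a structured record, which makes it harder to pass off a trigger as a root cause. The following is a minimal sketch, not a standard post-mortem schema: the class name `CauseAnalysis`, its fields, and the `summary` wording are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CauseAnalysis:
    """Hypothetical record that separates the three levels of causation."""
    proximate_cause: str   # the immediate trigger
    root_cause: str        # the underlying systemic gap
    contributing_factors: list[str] = field(default_factory=list)

    def summary(self) -> str:
        # Assemble sentences using the causation vocabulary from this section.
        parts = [f"The incident was triggered by {self.proximate_cause}."]
        for factor in self.contributing_factors:
            parts.append(f"A contributing factor was {factor}.")
        parts.append(f"The root cause was {self.root_cause}.")
        return " ".join(parts)


analysis = CauseAnalysis(
    proximate_cause="a misconfigured load balancer timeout",
    root_cause="infrastructure configuration that is not version-controlled",
    contributing_factors=[
        "a staging environment that did not mirror production timeout settings"
    ],
)
print(analysis.summary())
```

Requiring both `proximate_cause` and `root_cause` as separate fields is the point of the design: a write-up with only one of the two cannot be constructed.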
The "5 Whys" technique is used in root cause analysis. What does it involve?
Iterative "why?" until you reach a systemic root cause. Example 5 Whys:
- Why? — "Payment service returned 500 errors"
- Why? — "Database connection pool exhausted"
- Why? — "Number of DB connections increased after config change"
- Why? — "Config change was not reviewed by a DBA"
- Why? — "No DBA review step exists in the deployment checklist"
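The chain above can be sketched as a list of (effect, cause) pairs, where each answer becomes the effect that the next "why?" asks about. This is only an illustration: the function name `build_why_chain` is hypothetical, and the sketch assumes the first line of the example is the observed symptom and the remaining four lines are the successive answers.

```python
def build_why_chain(symptom: str, answers: list[str]) -> list[tuple[str, str]]:
    """Pair each effect with the cause given in answer to 'why?'."""
    chain = []
    effect = symptom
    for cause in answers:
        chain.append((effect, cause))
        effect = cause  # the next "why?" is asked about this cause
    return chain


chain = build_why_chain(
    "Payment service returned 500 errors",
    [
        "Database connection pool exhausted",
        "Number of DB connections increased after config change",
        "Config change was not reviewed by a DBA",
        "No DBA review step exists in the deployment checklist",
    ],
)

for depth, (effect, cause) in enumerate(chain, start=1):
    print(f"Why #{depth}: {effect}")
    print(f"  Because: {cause}")

# The last cause in the chain is the candidate root cause.
root_cause_candidate = chain[-1][1]
print(f"Root cause candidate: {root_cause_candidate}")
```

The code only records the chain. Deciding whether the final answer is truly a systemic process gap, or whether another "why?" is needed, remains a human judgment.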
Which sentence uses the correct root cause framing vocabulary?
Trigger + contributing factor + root cause with systemic framing is correct. Key vocabulary:
- "Triggered by" — the immediate event that started the incident
- "Contributing factor" — a condition that made the trigger more severe or more likely
- "Root cause" — the systemic gap: "lack of a mandatory staging validation step"
- "Due to" and "as a result of" — used to show causation chains
A post-mortem states: "John pushed a bad config to prod." How should this be rewritten in blameless language?
Describe the technical failure and the systemic gap — not the person. Blameless rewrite structure:
- ❌ "John pushed bad config" — identifies and blames a person
- ✅ "A config with an incorrect timeout value was deployed" — describes the technical event
- ✅ "...not caught by pre-deployment validation" — identifies the process gap
- ✅ "...as staging lacked equivalent load" — explains why the gap existed
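As a rough illustration of the rewrite above, a draft post-mortem could be scanned for person-focused phrasing before review. The sketch below is a crude heuristic under stated assumptions, not an established tool: the verb list, the pattern, and the function name `flags_blame` are all hypothetical, and the check will both miss blameful sentences and flag innocent ones (for example, a capitalized system name followed by one of the verbs).

```python
import re

# Hypothetical heuristic: a capitalized name or a personal pronoun directly
# followed by an action verb suggests the sentence is about a person.
BLAME_VERBS = ("pushed", "deployed", "broke", "forgot", "misconfigured")
BLAME_PATTERN = re.compile(
    r"\b(?:[A-Z][a-z]+|he|she|they)\s+(?:" + "|".join(BLAME_VERBS) + r")\b"
)


def flags_blame(sentence: str) -> bool:
    """Return True if the sentence matches the person-plus-verb pattern."""
    return BLAME_PATTERN.search(sentence) is not None


drafts = [
    "John pushed a bad config to prod.",
    "A config with an incorrect timeout value was deployed.",
]
for draft in drafts:
    label = "review wording" if flags_blame(draft) else "ok"
    print(f"{label}: {draft}")
```

A flagged sentence is only a prompt to rewrite it around the technical event and the process gap, as in the structure above.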
What is the difference between "due to" and "as a result of" in root cause writing?
Both express causation. Under the traditional rule, "due to" follows a form of "to be", while "as a result of" can appear in any position. Usage in technical writing:
- ✅ "The outage was due to a misconfigured timeout" — "was due to" (be + due to)
- ✅ "As a result of the misconfigured timeout, the service became unresponsive" — any position
- ❌ "Due to a misconfigured timeout, the service became unresponsive" — discouraged under the traditional rule, because "due to" does not follow a form of "to be" here (but widely accepted in practice)