Writing About Failure Publicly
Transparent language, active voice, specific empathy, verifiable actions, and technical level
Public failure writing essentials
- Transparency: specific numbers + clear ownership — "unprecedented" and "may have" are deflection signals
- Active voice: "we did not roll back in time" + root cause context, not "the rollback was delayed"
- Specific empathy: name the actual impact users experienced — generic empathy feels hollow
- Verifiable actions: specific + measurable + scoped — not "improved monitoring"
- Technical level: plain-language cause for all readers + linked engineering deep-dive for technical audiences
A company's public postmortem begins: "Due to unprecedented circumstances beyond our control, some users may have experienced issues with our service." What is wrong with this?
- ❌ "unprecedented circumstances" — almost nothing is truly unprecedented; this signals spin
- ❌ "beyond our control" — even if true, this deflects ownership of the response
- ❌ "may have experienced" — if users reported it, they experienced it; "may have" is deflection
- ❌ "some users" — be specific: "approximately 15% of users in the EU region" is honest
- ✅ "Between 14:35 and 17:42 UTC on 14 March, approximately 23% of our users were unable to complete checkout. We take full responsibility for this disruption. Here is what happened and what we have changed."
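The deflection signals listed above can be checked mechanically before a draft is published. The following is a minimal sketch, not an established tool: the phrase list comes only from the examples in this question, and the names `DEFLECTION_PATTERNS` and `find_deflection_signals` are hypothetical.

```python
import re

# Hypothetical phrase list, taken from the examples in this lesson.
# Each pattern maps to the reason it reads as deflection.
DEFLECTION_PATTERNS = {
    r"\bunprecedented\b": "signals spin; describe what actually happened",
    r"\bbeyond our control\b": "deflects ownership of the response",
    r"\bmay have\b": "hedges an impact that users already reported",
    r"\bsome users\b": "vague scope; give a percentage and a region",
}


def find_deflection_signals(text: str) -> list[tuple[str, str]]:
    """Return (matched phrase, reason) pairs in order of appearance."""
    hits = []
    for pattern, reason in DEFLECTION_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.group(0), reason))
    hits.sort(key=lambda hit: hit[0])  # order by position in the draft
    return [(phrase, reason) for _, phrase, reason in hits]


if __name__ == "__main__":
    draft = (
        "Due to unprecedented circumstances beyond our control, "
        "some users may have experienced issues with our service."
    )
    for phrase, reason in find_deflection_signals(draft):
        print(f"{phrase!r}: {reason}")
```

Run against the opening sentence from the question, this flags all four phrases; the rewritten opening with the time window and the 23% figure flags none. A phrase list like this only catches known wording, so it supplements a human review rather than replacing it.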
Which active-voice rewrite of a passive sentence works best in a public postmortem?
- ❌ "The deployment was not rolled back in time" — no one is accountable; readers feel the blame is being hidden
- ✅ "We did not roll back the deployment in time" — clear ownership ("we")
- ✅ Then add root cause context: "our runbook did not specify a rollback trigger threshold" (explains the systemic gap, not a personal failure)
- Passive voice is still appropriate when the actor genuinely cannot be identified: "An anomalous request was received..." (the source is unknown)
- Passive voice is also appropriate when the action matters more than the actor: "The service was restored at 17:42 UTC"
A company writes: "We understand the frustration this may have caused." What is missing from this empathy statement?
- ❌ "We understand the frustration this may have caused" — "may have caused" still deflects; "frustration" is vague
- ✅ Specific impact: "not being able to complete your purchase during one of our busiest shopping periods"
- ✅ Concrete consequence: "cost many of you time and money"
- ✅ Direct apology: "we're sorry" — not hedged with "if" ("we're sorry if this inconvenienced you" implies it might not have)
A public postmortem's "What we've done" section lists: "1. Improved monitoring. 2. Enhanced deployment processes. 3. Strengthened our infrastructure." What is the core problem?
- ❌ "Improved monitoring" — which metric? which system? by how much?
- ✅ "Added per-region error rate dashboards to Grafana with PagerDuty alerts at 2% error rate per region (previously only global alerts existed)"
- ❌ "Enhanced deployment processes" — what specifically changed?
- ✅ "Added a mandatory 10-minute canary phase (5% traffic) with automatic rollback if error rate exceeds 1% — this would have caught and rolled back the failing deployment automatically"
- Sceptical readers will assume vague actions are PR-speak
- Specific actions allow readers to verify: "they said they added canary deployment — can I see this in their status page or changelog?"
- Your own team needs to hold itself accountable — vague actions are never "done"
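Part of what makes the ✅ actions above verifiable is that each one reduces to a rule someone could implement and test. A minimal sketch of the two rules follows, assuming the thresholds quoted in the examples (1% for canary rollback, 2% for per-region alerts). The `WindowStats` type and both function names are hypothetical, and a real setup would live in the alerting and deployment tooling rather than in application code.

```python
from dataclasses import dataclass

# Thresholds quoted in the example actions; everything else is illustrative.
CANARY_ROLLBACK_ERROR_RATE = 0.01  # roll back if the canary error rate exceeds 1%
REGION_ALERT_ERROR_RATE = 0.02     # alert when a region reaches a 2% error rate


@dataclass
class WindowStats:
    """Request and error counts observed over one measurement window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # An empty window counts as healthy rather than dividing by zero.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(canary: WindowStats) -> bool:
    """True if the canary phase should trigger an automatic rollback."""
    return canary.error_rate > CANARY_ROLLBACK_ERROR_RATE


def regions_to_alert(per_region: dict[str, WindowStats]) -> list[str]:
    """Names of regions whose error rate has reached the alert threshold."""
    return sorted(
        region
        for region, stats in per_region.items()
        if stats.error_rate >= REGION_ALERT_ERROR_RATE
    )


if __name__ == "__main__":
    traffic = {"eu": WindowStats(2000, 60), "us": WindowStats(5000, 10)}
    print(regions_to_alert(traffic))                  # per-region view
    print(should_roll_back(WindowStats(1000, 25)))    # canary at 2.5% errors
```

In the sample data the EU region sits at 3% errors while the combined rate across both regions is only 1%, which is the gap a global-only alert would miss.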
At what level of technical detail should a public-facing postmortem be written?
- Plain-language cause: "A configuration change to our load balancer incorrectly directed all checkout traffic to a single server rather than distributing it — that server became overloaded and began rejecting requests" — understandable without technical background
- Avoid: "A misconfigured weight parameter in our NGINX upstream configuration caused a 100:0 load imbalance across our checkout pool" — accurate but inaccessible to most customers
- Engineering deep-dive: a separate linked document or section for technical readers who want the specifics
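For the linked engineering deep-dive, a short simulation can make the failure mode concrete for technical readers. The sketch below is an idealized proportional split, not NGINX's actual balancing algorithm; the `distribute` function and the server names are hypothetical.

```python
def distribute(total_requests: int, weights: dict[str, int]) -> dict[str, int]:
    """Split requests across servers in proportion to their weights.

    Idealized model for illustration only: real load balancers interleave
    requests over time instead of assigning them in one batch.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one server needs a positive weight")

    shares = {
        name: total_requests * weight // total_weight
        for name, weight in weights.items()
    }
    # Integer division can leave a few requests unassigned; give them to
    # the highest-weight server so the counts always add up.
    remainder = total_requests - sum(shares.values())
    heaviest = max(weights, key=weights.get)
    shares[heaviest] += remainder
    return shares


if __name__ == "__main__":
    intended = {"checkout-1": 50, "checkout-2": 50}
    misconfigured = {"checkout-1": 100, "checkout-2": 0}
    print(distribute(1000, intended))
    print(distribute(1000, misconfigured))
```

With the intended weights each server receives half the traffic; with the 100:0 weights one server receives all of it, which is the imbalance the plain-language summary describes as "all checkout traffic to a single server".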