How to Write a Blameless Post-Mortem in English

Learn the English phrases, structure, and vocabulary for writing effective blameless post-mortems: 5 Whys, contributing factors, timeline, and action items.

A post-mortem (also called an incident review or retrospective) is a structured document written after a significant incident to understand what happened and prevent recurrence. The defining feature of a blameless post-mortem is that it focuses on systems, processes, and circumstances — not on individual mistakes or failures.

Writing a clear, professional post-mortem in English is a core skill for engineers in SRE, DevOps, and engineering management roles.


The Blameless Principle

What “Blameless” Means in Practice

Blameless culture, popularised by Google and Etsy, rests on a key assumption: people acting in good faith made the best decisions they could with the information available at the time. The goal is systemic improvement, not punishment.

Language that assigns blame (avoid):

  • “John misconfigured the load balancer, causing the outage.”
  • “The on-call engineer failed to respond quickly enough.”

Blameless alternatives:

  • “A misconfiguration in the load balancer caused the outage.”
  • “The alert threshold was set too high, which delayed detection by approximately eight minutes.”

Notice how blameless language focuses on the system, the process, and the conditions — not on individuals.


Post-Mortem Structure

1. Summary

Open with a brief paragraph that answers: what happened, when, and what was the user impact?

“On 12 June 2026 between 14:32 and 16:07 UTC, the checkout service was unavailable to approximately 40% of users in the EU region. Orders placed during this window were not processed. The immediate cause was a memory leak introduced in release v2.14.1, deployed at 14:20 UTC.”

2. Timeline

The timeline is a chronological list of events. Use the past simple tense and give precise times with a time zone.

  • 14:20 UTC — Release v2.14.1 deployed to production.
  • 14:32 UTC — Error rate on the checkout service exceeded 5%; PagerDuty alert fired.
  • 14:38 UTC — On-call engineer acknowledged the alert and began investigation.
  • 15:10 UTC — Root cause identified as memory exhaustion in the payment processor client.
  • 16:07 UTC — Hotfix deployed; error rate returned to baseline.
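
The intervals a post-mortem usually reports (time to detect, time to acknowledge, duration of impact) can be derived directly from a timeline like the one above. The following is a minimal Python sketch under that assumption; the `TIMELINE` dictionary, its event labels, and the `minutes_between` helper are hypothetical names chosen for illustration, not part of any incident tooling.

```python
from datetime import datetime, timezone

# Timeline entries from the example above, all on 12 June 2026 (UTC).
# The keys are illustrative labels, not a standard schema.
TIMELINE = {
    "deployed": "14:20",
    "alert_fired": "14:32",
    "acknowledged": "14:38",
    "cause_identified": "15:10",
    "resolved": "16:07",
}


def parse_utc(hhmm: str) -> datetime:
    """Turn an HH:MM string into a timezone-aware datetime on the incident date."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2026, 6, 12, hour, minute, tzinfo=timezone.utc)


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two named timeline events."""
    delta = parse_utc(TIMELINE[end]) - parse_utc(TIMELINE[start])
    return int(delta.total_seconds() // 60)


time_to_detect = minutes_between("deployed", "alert_fired")
time_to_acknowledge = minutes_between("alert_fired", "acknowledged")
impact_duration = minutes_between("alert_fired", "resolved")

print(f"Time to detect: {time_to_detect} minutes")
print(f"Time to acknowledge: {time_to_acknowledge} minutes")
print(f"Impact duration: {impact_duration} minutes")
```

Computing the intervals rather than estimating them keeps the prose precise: “The alert fired 12 minutes after the release was deployed” is easier to verify than “The alert fired shortly after deployment.”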

Useful timeline phrases:

  • “At approximately…”
  • “Shortly after…”
  • “Approximately N minutes later…”
  • “The alert fired at…”
  • “The issue was first observed at…”

3. Root Cause Analysis

This is the analytical core of the document. The most common tool is the 5 Whys technique.


The 5 Whys in English

The 5 Whys is a structured questioning technique that traces a problem back to its systemic root cause by asking “Why?” repeatedly.

Example:

  • Why did the checkout service fail? → Because it ran out of memory.
  • Why did it run out of memory? → Because the payment client was not releasing connections after use.
  • Why was the connection leak not caught before deployment? → Because our integration tests did not run under sustained load.
  • Why did our integration tests not cover sustained load? → Because we have no load testing stage in our deployment pipeline.
  • Why is there no load testing stage? → Because it was deprioritised when the pipeline was set up eighteen months ago.

The root cause is not the connection leak itself — it is the absence of load testing in the deployment pipeline.
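
If your team keeps post-mortems in a repository, the chain can also be recorded as ordered question and answer pairs and rendered into the document. This is only a sketch: the `FIVE_WHYS` list and the `render_chain` function are hypothetical names, and the text is copied from the example above.

```python
# Each pair is (question, answer), in the order the questions were asked.
FIVE_WHYS = [
    ("Why did the checkout service fail?",
     "it ran out of memory"),
    ("Why did it run out of memory?",
     "the payment client was not releasing connections after use"),
    ("Why was the connection leak not caught before deployment?",
     "our integration tests did not run under sustained load"),
    ("Why did our integration tests not cover sustained load?",
     "we have no load testing stage in our deployment pipeline"),
    ("Why is there no load testing stage?",
     "it was deprioritised when the pipeline was set up eighteen months ago"),
]


def render_chain(chain):
    """Format the chain as a numbered list of 'Why ...? Because ...' lines."""
    lines = []
    for depth, (question, answer) in enumerate(chain, start=1):
        lines.append(f"{depth}. {question} Because {answer}.")
    return "\n".join(lines)


print(render_chain(FIVE_WHYS))
```

Keeping each answer as a plain clause makes it easy to check that every “Why?” question picks up the answer immediately before it, which is what keeps the chain honest.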

Useful phrases for 5 Whys:

  • “Tracing the chain of causation…”
  • “This in turn was caused by…”
  • “The underlying factor was…”
  • “This points to a gap in our…”

Contributing Factors

Identifying Contributing Factors

A single root cause rarely tells the full story. Contributing factors are conditions that made the incident worse or increased its likelihood.

“The following contributing factors amplified the impact of the incident:”

  • “The lack of a memory usage alert meant the issue was not detected until the service became unresponsive.”
  • “The hotfix deployment process required manual approval, adding approximately twenty minutes to the resolution time.”
  • “Runbook documentation for this service had not been updated since Q3 2025, which slowed the initial investigation.”

Language for Contributing Factors

  • “A contributing factor was…”
  • “This was exacerbated by…”
  • “The impact was amplified by…”
  • “A secondary factor was the absence of…”

What Went Well

Blameless post-mortems also document what the team did well. This reinforces good practices and provides balance.

  • “The on-call engineer correctly identified the affected service within six minutes of the alert firing.”
  • “The incident channel was opened promptly and stakeholders were notified within ten minutes of detection.”
  • “The automated rollback tooling was available and functional, though it was not used in this instance.”


Action Items

Writing Effective Action Items

Action items must be specific, assignable, and time-bound. Vague action items are rarely completed.

Weak action item:

“Improve monitoring.”

Strong action item:

“Add a memory usage alert for the payment processor client that fires when RSS exceeds 80% of the container limit. Owner: Platform team. Target: 2026-06-28.”
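
The difference between the two examples can be made mechanical. The sketch below is one possible way to check an action item before it goes into the document; the `ActionItem` class and its `missing_fields` method are hypothetical, and the five-word minimum used as a stand-in for “specific” is an arbitrary assumption, not an established rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    description: str
    owner: Optional[str] = None
    target: Optional[date] = None

    def missing_fields(self) -> List[str]:
        """Return what the item still needs before it is ready to assign."""
        missing = []
        # Crude heuristic (assumption): very short descriptions are rarely specific.
        if len(self.description.split()) < 5:
            missing.append("specific description")
        if not self.owner:
            missing.append("owner")
        if self.target is None:
            missing.append("target date")
        return missing


weak = ActionItem("Improve monitoring.")
strong = ActionItem(
    "Add a memory usage alert for the payment processor client that fires "
    "when RSS exceeds 80% of the container limit.",
    owner="Platform team",
    target=date(2026, 6, 28),
)

print(weak.missing_fields())
print(strong.missing_fields())
```

A check like this cannot judge whether an action item is the right one, but it does catch items that have no owner or no date before the review is published.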

Action Item Language

  • “Add an alert for…”
  • “Update the runbook to include…”
  • “Introduce a load testing stage to the deployment pipeline before…”
  • “Review and update documentation for…”
  • “Conduct a review of all services that use…”

Practical Phrases for Post-Mortems

  • “This incident was caused by a combination of…”
  • “The detection time was longer than expected because…”
  • “We have identified the following action items to prevent recurrence…”
  • “No single factor caused this incident; rather, several contributing factors aligned.”
  • “The blameless framing of this review is intentional: our goal is to improve the system, not to assess individual performance.”

A well-written blameless post-mortem builds trust within a team and creates a shared understanding of complex systems. Using precise, objective English — focused on systems rather than people — is the foundation of that trust.