How to Write a Post-Mortem Report in English

Learn the structure, language, and blameless tone for writing a post-mortem report in English — incident timeline, root cause, contributing factors, and action items.

A post-mortem (also called a post-incident review or PIR) is the written record of what happened during an outage or incident, why it happened, and what the team will do to prevent it from happening again. Writing a clear, blameless post-mortem in English is a professional skill that marks you as a mature engineer and a reliable team member.

This guide covers the structure, the key phrases, and — most importantly — the blameless tone that makes post-mortems effective rather than political.


What Is a Blameless Post-Mortem?

The blameless post-mortem was popularised by engineering teams such as Etsy and by Google’s Site Reliability Engineering (SRE) practices, and it has become standard in modern engineering organisations. The core idea is that incidents are almost always caused by systemic failures (unclear runbooks, insufficient monitoring, missing guardrails), not by individual negligence.

A blameless post-mortem focuses on the system, not the person. Instead of writing “The engineer accidentally deleted the production database,” you write “The deletion command lacked a safety prompt and was executed against the wrong environment because the environment names in the CLI were visually similar.”

This distinction matters both ethically and practically. Blame discourages honest reporting, because people who fear punishment leave out the details that matter most. Without honest reporting, the real root causes remain unfixed.


Structure of a Post-Mortem Report

1. Summary

A two-to-four-sentence overview of the incident.

Key phrases:

  • “On [date], [service] experienced an outage lasting [duration], affecting [scope of impact].”
  • “The root cause was [brief description]. The incident was resolved at [time] by [action taken].”

Example:

“On 2026-06-18, the checkout service experienced a 47-minute outage, affecting approximately 1,200 users. The root cause was a misconfigured database connection pool following a routine deployment. The incident was resolved at 15:09 UTC by reverting the deployment and restarting the affected services.”

2. Impact

Quantify the impact: which services were affected, how many users, how much revenue, and for how long.

  • “Services affected: Checkout API, Order Confirmation emails.”
  • “Users affected: approximately 1,200 — all customers in the EU region.”
  • “Revenue impact: estimated £4,200 in delayed transactions (all recovered post-incident).”
  • “Duration: 14:22 UTC to 15:09 UTC — 47 minutes.”

3. Timeline

The timeline is the factual spine of the post-mortem. Write it in chronological order with precise timestamps, all in one time zone (UTC in the examples here). Use the past simple tense consistently.

Key timeline phrases:

  • “14:22 UTC — The deployment of v3.7.2 completed successfully.”
  • “14:25 UTC — Error rate on the checkout endpoint began rising above the 1% threshold.”
  • “14:31 UTC — PagerDuty alert fired. On-call engineer acknowledged the page.”
  • “14:38 UTC — The team identified the misconfigured connection pool as the likely cause.”
  • “14:55 UTC — Rollback initiated.”
  • “15:09 UTC — Service restored to full health. Error rate returned to baseline.”
  • “15:22 UTC — All-clear issued. Monitoring continued for 30 minutes.”

Write the timeline as a neutral, factual record. Do not editorialise in the timeline section.
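A timeline is easy to check mechanically before the report goes out. The following is a minimal Python sketch, assuming the timestamps from the example above; the names TIMELINE, parse, is_chronological, and minutes_between are hypothetical helpers written for this illustration, not part of any tool.

```python
from datetime import datetime

# Illustrative data: the example timeline from this guide (all times UTC).
INCIDENT_DATE = "2026-06-18"
TIMELINE = [
    ("14:22", "Deployment of v3.7.2 completed"),
    ("14:25", "Error rate rose above the 1% threshold"),
    ("14:31", "PagerDuty alert fired"),
    ("14:38", "Misconfigured connection pool identified"),
    ("14:55", "Rollback initiated"),
    ("15:09", "Service restored"),
    ("15:22", "All-clear issued"),
]


def parse(hhmm: str) -> datetime:
    """Turn an 'HH:MM' string into a datetime on the incident date."""
    return datetime.strptime(f"{INCIDENT_DATE} {hhmm}", "%Y-%m-%d %H:%M")


def is_chronological(entries) -> bool:
    """Return True if every entry is at or after the previous one."""
    times = [parse(stamp) for stamp, _ in entries]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two 'HH:MM' timestamps."""
    return int((parse(end) - parse(start)).total_seconds() // 60)
```

With this data, minutes_between("14:22", "15:09") gives the 47-minute duration quoted in the Impact section, and minutes_between("14:25", "14:31") gives the 6-minute gap between the first errors and the alert.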

4. Root Cause

Explain what actually caused the incident. Apply the Five Whys technique: keep asking “why” until you reach a systemic cause.

Key phrases:

  • “The root cause was…”
  • “The immediate trigger was [X], but the underlying cause was [Y].”
  • “Investigation revealed that…”
  • “The failure occurred because [condition A] and [condition B] were both true simultaneously.”

Example:

“The root cause was that the database connection pool configuration was set using an environment variable that defaulted to 5 connections in all environments, including production, where 50 connections are required. The deployment pipeline did not validate this value before release, and no alert existed for the ‘pool exhausted’ error condition.”
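To make the example concrete, here is a rough sketch of the kind of pre-release check that was missing from the pipeline. It assumes the pool size is read from an environment variable; the variable name DB_POOL_SIZE and the function names are invented for this illustration, and only the values 5 and 50 come from the example above.

```python
from typing import List, Mapping

# Values taken from the example root cause; the variable name is an assumption.
DEFAULT_POOL_SIZE = 5
REQUIRED_MIN_POOL_SIZE = {"production": 50}


def resolve_pool_size(env: Mapping[str, str]) -> int:
    """Read the pool size, falling back to the default when it is unset."""
    return int(env.get("DB_POOL_SIZE", DEFAULT_POOL_SIZE))


def validate_pool_size(environment: str, env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    size = resolve_pool_size(env)
    minimum = REQUIRED_MIN_POOL_SIZE.get(environment)
    if minimum is not None and size < minimum:
        return [
            f"DB_POOL_SIZE={size} is below the required minimum "
            f"of {minimum} for {environment}"
        ]
    return []
```

A pipeline step could call validate_pool_size("production", os.environ) and stop the release if the list is not empty. Returning messages rather than raising keeps the check easy to combine with other validations.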

5. Contributing Factors

These are the systemic conditions that made the incident possible or made it worse. This is where blameless language is most critical.

Blameless language patterns:

  • “The runbook for this deployment did not include a step to verify the connection pool configuration.”
  • “There was no automated check for this configuration value in the deployment pipeline.”
  • “The monitoring dashboard did not surface connection pool metrics, delaying detection by approximately 6 minutes.”
  • “The staging environment uses a different database tier with different connection limits, so the issue was not caught in testing.”

Notice that no sentence begins with “The engineer failed to…” or “The developer forgot to…”. The contributing factors describe systemic gaps, not individual errors.

6. What Went Well

Post-mortems should also capture what worked. This reinforces good practices.

  • “The on-call engineer responded within 4 minutes of the alert firing.”
  • “The rollback procedure was well-documented and executed cleanly.”
  • “The incident channel was created immediately, enabling parallel coordination.”

7. Action Items

Action items are the most important section of the post-mortem. Each item must be specific, assigned to an owner, and given a due date.

Weak action item: “Improve monitoring.”

Strong action item: “Add a PagerDuty alert for database connection pool utilisation above 80%. Owner: @devops-team. Due: 2026-06-30.”

Template for each action item:

  • What: The specific change to be made
  • Why: Which contributing factor this addresses
  • Owner: Named individual or team
  • Due date: A specific date, not “soon” or “next sprint”
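The template above can also be expressed as a small data structure with a completeness check. This is only a sketch: the ActionItem class and the problems function are hypothetical, and the four-word threshold for “specific” is a crude heuristic chosen for illustration, not an established rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    what: str             # the specific change to be made
    why: str              # the contributing factor it addresses
    owner: str            # named individual or team
    due: Optional[date]   # a specific date, or None if missing


def problems(item: ActionItem) -> List[str]:
    """List the template fields that are missing or too vague."""
    issues = []
    if len(item.what.split()) < 4:  # heuristic: very short items are rarely specific
        issues.append("'what' is too short to be specific")
    if not item.why.strip():
        issues.append("'why' is missing")
    if not item.owner.strip():
        issues.append("'owner' is missing")
    if item.due is None:
        issues.append("'due' date is missing")
    return issues


weak = ActionItem("Improve monitoring.", "", "", None)
strong = ActionItem(
    what="Add a PagerDuty alert for database connection pool utilisation above 80%",
    why="No alert existed for the pool exhausted condition",
    owner="@devops-team",
    due=date(2026, 6, 30),
)
```

Here problems(weak) reports all four fields, while problems(strong) returns an empty list.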

Tone and Language

Use Past Simple for Events

The post-mortem describes what happened. Use past simple consistently: “The alert fired,” “The engineer investigated,” “The team decided to roll back.”

Use Passive Voice Sparingly

Passive voice hides agency, which can feel evasive. “The database was deleted” is weaker than “The deletion command ran against the production database.” The active version stays blameless because its subject is a part of the system, not a person.

Avoid Hedging in the Root Cause Section

The root cause section should be definitive: “The root cause was X.” Avoid “The root cause may possibly have been…” unless you are genuinely uncertain. In that case, state the uncertainty once and plainly: “We believe the root cause was X, though this is not fully confirmed. Further investigation is in progress.”


Key Vocabulary

  • Post-mortem — a written analysis of an incident after it is resolved
  • Blameless — focused on systemic causes rather than individual fault
  • Root cause — the underlying reason an incident occurred
  • Contributing factor — a systemic condition that made the incident possible or worse
  • MTTR — Mean Time to Recover; the average time from incident detection to resolution
  • MTTD — Mean Time to Detect; the average time from incident start to alert firing
  • Action item — a specific task assigned to a named owner with a due date
  • Five Whys — a technique of repeatedly asking “why” to reach the root cause
  • All-clear — the signal that an incident is fully resolved
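MTTD and MTTR are simple averages over several incidents. The sketch below follows the definitions in the list above (start to detection, and detection to resolution). The incident records are invented for illustration, and the names INCIDENTS, minutes, mttd, and mttr are hypothetical.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Invented records for illustration only (all times UTC).
INCIDENTS = [
    {"started": "2026-01-10 10:00", "detected": "2026-01-10 10:06", "resolved": "2026-01-10 10:40"},
    {"started": "2026-02-03 09:00", "detected": "2026-02-03 09:10", "resolved": "2026-02-03 09:30"},
]


def minutes(start: str, end: str) -> float:
    """Minutes elapsed between two timestamps in FMT format."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


def mttd(incidents) -> float:
    """Mean minutes from incident start to detection."""
    return mean([minutes(i["started"], i["detected"]) for i in incidents])


def mttr(incidents) -> float:
    """Mean minutes from detection to resolution."""
    return mean([minutes(i["detected"], i["resolved"]) for i in incidents])
```

For these two records, the detection times are 6 and 10 minutes and the recovery times are 34 and 20 minutes, so mttd(INCIDENTS) is 8.0 and mttr(INCIDENTS) is 27.0. Teams define the start and end points differently, so state the definition you use next to the number.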

Writing a good post-mortem is an act of organisational generosity. You are giving your team the truth about what happened so they can build something more reliable. Done well, it is one of the most valuable documents an engineering team can produce.