Practice incident severity level language: SEV-1 through SEV-4 definitions, declaring and downgrading severity, impact assessment, customer-impacting vs internal-only classification.
1 / 5
An on-call engineer pages the incident commander with the message: "Declaring SEV-1 — checkout is down for all customers." What does declaring SEV-1 typically mean?
Declaring SEV-1 triggers a specific protocol: the IC assumes command, a war room opens, the comms lead drafts a status page update, the executive sponsor is notified, and engineering is mobilised. The word 'declaring' is deliberate: it is an active, formal act, not a passive observation. Companies define SEV-1 differently, but common criteria include 100% of customers impacted, core revenue-generating functionality unavailable, or potential for significant data loss.
2 / 5
What is the typical distinction between a SEV-2 and SEV-3 incident?
SEV definitions vary by company, but the general pattern is: SEV-1 (critical, total outage), SEV-2 (major impact, significant degradation), SEV-3 (partial impact, workarounds available), SEV-4 (minor issue, cosmetic or low-usage). The distinction drives the response: a SEV-2 gets a dedicated IC and war room, while a SEV-3 might be handled by the on-call engineer with regular status updates. Misclassifying severity (for example, calling a SEV-1 a SEV-3) is a common failure mode that delays the appropriate response.
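The general pattern above can be written down as a small decision function. This is only a sketch: the `Impact` fields, the `classify_severity` name, and the 25% and 1% thresholds are hypothetical, since every company sets its own criteria.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values come from your own severity policy.
MAJOR_IMPACT_PCT = 25.0
PARTIAL_IMPACT_PCT = 1.0


@dataclass
class Impact:
    pct_customers_affected: float  # 0 to 100
    core_function_down: bool       # e.g. checkout unavailable
    workaround_available: bool
    data_loss_risk: bool = False


def classify_severity(impact: Impact) -> str:
    """Map an impact assessment to a SEV label (illustrative rules only)."""
    # SEV-1: total outage, core functionality down, or risk of data loss.
    if (
        impact.core_function_down
        or impact.pct_customers_affected >= 100
        or impact.data_loss_risk
    ):
        return "SEV-1"
    # SEV-2: major impact with no workaround.
    if impact.pct_customers_affected >= MAJOR_IMPACT_PCT and not impact.workaround_available:
        return "SEV-2"
    # SEV-3: partial impact, or major impact that customers can work around.
    if impact.pct_customers_affected >= PARTIAL_IMPACT_PCT:
        return "SEV-3"
    # SEV-4: minor, cosmetic, or low-usage.
    return "SEV-4"


if __name__ == "__main__":
    print(classify_severity(Impact(100.0, True, False)))
    print(classify_severity(Impact(40.0, False, True)))
```

Encoding the rules, even roughly, makes the classification repeatable and gives the team something concrete to argue about when a severity call feels wrong.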
3 / 5
An incident commander says on the bridge: "Based on the new information, I'm downgrading this to SEV-2." When is downgrading severity appropriate?
Downgrading severity should be based on facts, not wishful thinking or social pressure. Valid reasons to downgrade: customer impact is less than initially thought, a mitigation is in place that restores most functionality, or the affected component is non-critical. Downgrading changes the response protocol — fewer people on the bridge, longer update cadence, different executive involvement. It should be communicated clearly: 'Downgrading to SEV-2 because [specific reason]. IC remains active, next update in 30 minutes.'
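A downgrade announcement in the shape quoted above could be generated by a helper that refuses to produce a message without a stated reason. This is a minimal sketch; `downgrade_message` and its parameters are hypothetical names, not part of any incident tool.

```python
def downgrade_message(current_sev: int, new_sev: int, reason: str, next_update_min: int) -> str:
    """Build a downgrade announcement; a higher SEV number means lower severity."""
    if new_sev <= current_sev:
        raise ValueError("a downgrade must move to a higher SEV number")
    reason = reason.strip()
    if not reason:
        # Forces the IC to state a fact, not just announce the change.
        raise ValueError("a downgrade needs a specific, factual reason")
    return (
        f"Downgrading to SEV-{new_sev} because {reason}. "
        f"IC remains active, next update in {next_update_min} minutes."
    )


if __name__ == "__main__":
    print(downgrade_message(1, 2, "a mitigation restored checkout for most customers", 30))
```

Requiring the reason as an argument mirrors the point that a downgrade should rest on facts rather than social pressure.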
4 / 5
An incident is classified as 'customer-impacting.' What does this classification trigger compared to an 'internal-only' incident?
The customer-impacting vs internal-only classification drives the communications protocol. Customer-impacting: the status page must be updated within a defined SLA (often 15-30 minutes), customer-facing support must be briefed, account managers may need to notify enterprise customers proactively, and the incident post-mortem must include a review of customer communication. Internal-only: no external communication is required, but internal SLAs still apply and a post-mortem is still conducted.
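The two branches of that protocol can be sketched as a function that returns the required actions and, for customer-impacting incidents, the status page deadline. The `comms_plan` name, the dictionary shape, and the 30-minute default are assumptions for illustration; the real SLA is whatever your organisation defines.

```python
from datetime import datetime, timedelta


def comms_plan(customer_impacting: bool, declared_at: datetime, status_page_sla_min: int = 30) -> dict:
    """Return communication actions for an incident (illustrative only)."""
    if customer_impacting:
        return {
            # Deadline for the first status page update, counted from declaration.
            "status_page_due": declared_at + timedelta(minutes=status_page_sla_min),
            "actions": [
                "update status page",
                "brief customer-facing support",
                "consider proactive notice to enterprise customers",
                "include customer communication review in post-mortem",
            ],
        }
    # Internal-only: nothing external, but the incident is still tracked and reviewed.
    return {
        "status_page_due": None,
        "actions": ["apply internal SLAs", "conduct post-mortem"],
    }


if __name__ == "__main__":
    plan = comms_plan(True, datetime(2024, 1, 1, 12, 0), status_page_sla_min=15)
    print(plan["status_page_due"], plan["actions"])
```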
5 / 5
A post-incident review notes: "Impact assessment was delayed by 20 minutes because the team lacked tooling to identify affected customer percentage." Why is rapid impact assessment critical in incident response?
Impact assessment — 'how many customers are affected, and how?' — is the first priority after detecting an incident. It feeds severity classification, which feeds everything else. Teams invest in impact assessment tooling: dashboards showing affected users in real time, automated alerting with customer impact context, and runbooks with impact query templates. Common impact metrics: % of requests erroring, number of affected accounts, revenue impact per minute of downtime, geographic scope.
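As a rough illustration of an impact query, the sketch below computes three of the metrics listed above from a batch of request records. The `(account_id, succeeded)` record shape, the `summarize_impact` name, and the flat revenue-per-minute figure are all assumptions; real tooling would read from your own logs or metrics store.

```python
def summarize_impact(requests, revenue_per_minute: float, minutes_down: float) -> dict:
    """Summarize impact from (account_id, succeeded) request records."""
    total = len(requests)
    failed_accounts = [account for account, succeeded in requests if not succeeded]
    # Guard against an empty sample so the query never divides by zero.
    error_pct = 100.0 * len(failed_accounts) / total if total else 0.0
    return {
        "error_pct": error_pct,
        # Count each account once, however many of its requests failed.
        "affected_accounts": len(set(failed_accounts)),
        # Simplistic estimate: assumes revenue is lost at a constant rate.
        "estimated_revenue_lost": revenue_per_minute * minutes_down,
    }


if __name__ == "__main__":
    sample = [("a", True), ("a", False), ("b", False), ("c", True)]
    print(summarize_impact(sample, revenue_per_minute=1000.0, minutes_down=20))
```

Having a template like this ready in a runbook is one way to avoid the kind of 20-minute delay described in the post-incident review.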