Advanced · 6 topic areas · 64+ exercises

SRE / Platform Engineer

SREs are communication hubs during incidents, reliability reviews, and on-call handoffs. This path covers the precise language of error budgets, incident timelines, post-mortem facilitation, and reliability reporting.

Topics covered

SLO/SLI/SLA
Error budgets
Incident response
Post-mortem writing
Runbooks
Chaos engineering

Vocabulary spotlight

4 terms every SRE / Platform Engineer should know in English:

error budget n.

The maximum amount of unreliability permitted by an SLO over a given period

"We've burned 70% of this quarter's error budget after Monday's incident."

SLO n.

Service Level Objective — a target value for a reliability metric such as availability

"Our SLO is 99.9% availability, measured as a rolling 30-day window."

blameless post-mortem n.

An incident retrospective focused on systemic causes, not individual fault

"The blameless post-mortem revealed five contributing factors."

chaos engineering n.

Intentionally injecting faults into a system to verify resilience

"We use chaos engineering to validate that our circuit breakers actually work."
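The arithmetic behind the "error budget" and "SLO" entries above can be sketched in a few lines. This is a minimal illustration, not a standard tool: the function names `error_budget_minutes` and `budget_burned` and the 30.24 minutes of downtime are hypothetical, and the sketch assumes a purely time-based availability SLO (many teams measure request success rates instead).

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Minutes of unavailability the SLO permits over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_burned(downtime_minutes: float, slo_percent: float, window_days: int) -> float:
    """Fraction of the error budget consumed by the observed downtime."""
    return downtime_minutes / error_budget_minutes(slo_percent, window_days)


# A 99.9% availability SLO over a rolling 30-day window.
budget = error_budget_minutes(99.9, 30)
print(f"Error budget: {budget:.1f} minutes")  # Error budget: 43.2 minutes

# Hypothetical figure: 30.24 minutes of downtime so far in the window.
burned = budget_burned(30.24, 99.9, 30)
print(f"Budget burned: {burned:.0%}")  # Budget burned: 70%
```

Being able to state the budget in minutes ("we have about 13 minutes left this window") is often clearer for stakeholders than quoting the percentage alone.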

📚 Vocabulary Reference

Key terms organised by category for SRE / Platform Engineers:

Reliability Metrics

SLO, SLI, SLA, error budget, burn rate, alert threshold, MTTR, MTBF, availability, uptime, five nines

Incident Management

incident, SEV-1, SEV-2, on-call, page, escalation, incident commander, responder, mitigation, resolution, blameless post-mortem

Observability

metric, log, trace, span, distributed tracing, cardinality, p95, p99, dashboard, alert, runbook, playbook

Reliability Concepts

toil, automation, capacity planning, load shedding, graceful degradation, circuit breaker, retry with backoff, chaos engineering, fault injection

Infrastructure

Kubernetes, Prometheus, Grafana, PagerDuty, Datadog, cluster, node pool, resource limit, taint, toleration

Deployment Safety

canary, blue-green, rolling update, rollback, change freeze, deployment gate, smoke test, readiness probe, liveness probe

Recommended exercises

SRE & Reliability Vocabulary (Vocabulary, 30 exercises)
Performance & Monitoring Collocations (Vocabulary, 5 exercises)
Writing Post-Mortems & Incident Reports (Writing, 3 exercises)
Write a Runbook Entry for a Common Alert (Writing, 3 exercises)
Write an On-Call Handover Note (Writing, 3 exercises)
Post-Incident Email to Stakeholders (Writing, 4 exercises)
Reading Error Logs & Kubernetes Events (Reading, 3 exercises)
Conduct an Incident Call: Assign Roles & Declare Resolved (Speaking, 8 exercises)
SRE Engineer Interview Questions (Interview, 5 exercises)

Real-world scenarios you'll practise

Writing a blameless post-mortem after a SEV-1 incident
Presenting error budget burn rate to engineering leadership
Facilitating a live incident call with multiple teams
Drafting an SLO proposal for a new service
Writing a SEV-1 customer-facing status page update — honest, calm, no jargon, regular cadence
Explaining toil reduction to management — justifying automation investment in business terms
Writing a capacity planning proposal — current usage, projections, recommended provisioning, cost estimate

🎯 Interview questions specific to this role

Practise answering these questions out loud — or in writing. Each question targets a real interviewer concern for SRE / Platform Engineers.

How do you define and enforce SLOs for a new service?
Walk me through how you would handle a SEV-1 incident from alert to post-mortem.
How do you balance feature velocity with reliability work?
What is an error budget and how have you used one in practice?
How do you communicate reliability metrics to non-technical stakeholders?

Reference glossaries for SRE / Platform Engineers

Deep-dive glossaries covering terminology specific to this role:

CLI Commands Reference, Cloud Services Cheat Sheet, HTTP Status Codes

Frequently Asked Questions

What English skills do SRE / Platform Engineers most need to improve?

SRE / Platform Engineers most commonly need to improve: technical vocabulary (the correct English terms for domain concepts), collocation accuracy (using the right verb for each action), written communication (bug reports, PR descriptions, technical docs), and spoken communication for standups, code reviews, and stakeholder meetings.

How long does the SRE / Platform Engineer learning path take?

The SRE / Platform Engineer learning path contains 20–40 hours of material if studied comprehensively. Most learners start with the highest-priority modules and return to the rest over time. Spending 30 minutes per day for 4–6 weeks produces noticeable improvement in workplace English.

What vocabulary should an SRE / Platform Engineer prioritise first?

Start with the vocabulary that appears most in your daily work — terms you read in documentation, use in commit messages, and hear in meetings. The SRE / Platform Engineer path begins with the most frequent vocabulary clusters before moving to advanced communication patterns.

Are there interview exercises for SRE / Platform Engineer roles?

Yes. The SRE / Platform Engineer path includes role-specific interview question modules with model answers and key phrases — the actual questions interviewers ask and the vocabulary needed to answer them fluently. There is also a dedicated Interview Practice hub for general interview skills.

Does this path include pronunciation help?

Yes. The path links to pronunciation exercises for the technical terms most commonly mispronounced in this domain. The Pronunciation hub includes drills for acronyms, silent letters, word stress, and minimal pairs — all in IT context.

What are the most common English mistakes SRE / Platform Engineers make?

The most common mistakes are incorrect collocations (using the wrong verb with a technical noun), false friends from your first language (L1), tense errors when narrating past incidents or walkthroughs, and a register that is too formal or too casual for written communication.

How do I improve my English for code reviews?

Learn the standard code review collocations: approve a PR, request changes, leave a nit, address feedback, block a merge, resolve a conversation. Use hedging language for suggestions: "This might be cleaner as…", "Have you considered…?". The Collocations section includes a dedicated Code Review set.

Can I use this path alongside my daily work?

Yes — the path is designed for working professionals. Each exercise set takes 10–15 minutes. The most effective approach is to study a vocabulary module before a meeting or task where you'll use that vocabulary, then practise immediately after. Context-linked practice produces much faster retention.

Is the content free?

Yes, completely free. No registration required, no payment, no time limit. All vocabulary modules, exercises, glossary entries, and learning path guides are open access.

How do I track my progress through this path?

Progress is tracked in your browser's local storage — completed exercise sets are marked with a checkmark when you return. No account is needed. You can bookmark specific modules and use the exercises overview to see which sets you've completed.
