DevOps Engineer English Essentials

55 terms and 20 phrases DevOps and SRE engineers use under pressure — across CI/CD, containers, infrastructure, and the high-stakes English of incident updates and on-call escalation.

Last reviewed: 29 May 2026

CI/CD

pipeline: The automated sequence of steps that builds, tests and ships your code.
CI: Continuous Integration — merging and testing changes frequently and automatically.
CD: Continuous Delivery keeps every change ready to release at any time; Continuous Deployment goes a step further and ships each passing change to production automatically.
build: Compiling and packaging the code into a runnable form.
artifact: The output of a build (a binary, an image, a zip) stored for later deploy.
deploy: Releasing a new version of the software to an environment.
rollback: Reverting to the previous known-good version after a bad deploy.
canary: Releasing to a small slice of traffic first to catch problems early.
blue-green: Running two identical environments and switching traffic between them.
feature flag: A toggle that turns a feature on or off without redeploying.
staging: A production-like environment used to test before going live.
smoke test: A quick check after deploy that the basics still work.
gate: A required check (tests, approval) that must pass before the pipeline proceeds.
runner / agent: The machine that executes pipeline jobs.
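
To make "pipeline" and "gate" concrete, here is a minimal sketch in Python. It is not how any particular CI system works internally; the step names and the `run_pipeline` helper are hypothetical, chosen only to show that a failing gate stops everything after it.

```python
from typing import Callable, List, Tuple

# Each step is a (name, check) pair; the check returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first one that fails.

    Returns (succeeded, names_of_steps_that_ran).
    """
    ran: List[str] = []
    for name, check in steps:
        ran.append(name)
        if not check():
            # A failed gate blocks every later step, including deploy.
            return False, ran
    return True, ran


if __name__ == "__main__":
    steps = [
        ("build", lambda: True),
        ("unit-tests", lambda: False),  # simulated failing gate
        ("deploy", lambda: True),
    ]
    ok, ran = run_pipeline(steps)
    print("pipeline green" if ok else "pipeline red", ran)
```

In this sketch the deploy step never runs, which is the situation you would describe as "the pipeline is red".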

Containers & orchestration

container: A lightweight, isolated package of an app and its dependencies.
image: The immutable template a container is started from.
registry: A store for container images (Docker Hub, ECR, GHCR).
pod: The smallest deployable unit in Kubernetes — one or more containers together.
node: A worker machine in a cluster that runs pods.
cluster: A set of nodes managed together by an orchestrator like Kubernetes.
orchestration: Automatically scheduling, scaling and healing containers.
autoscaling: Adding or removing instances automatically based on load.
horizontal scaling: Adding more instances; vertical scaling means making each one bigger.
service mesh: A layer that manages traffic, security and observability between services.
ingress: The rules that route external traffic into a cluster.
sidecar: A helper container running alongside the main one in the same pod.
liveness probe: A health check that restarts a container if it stops responding.
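
The autoscaling and horizontal scaling entries above can be illustrated with a small proportional rule: scale the instance count by the ratio of the observed metric to the target. This is a simplified sketch, similar in spirit to the rule documented for the Kubernetes Horizontal Pod Autoscaler; real autoscalers add tolerances and cooldown periods. The function name and the default limits are assumptions for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # assumed lower bound
    max_replicas: int = 10,  # assumed upper bound
) -> int:
    """Return how many instances to run so the metric moves toward its target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Proportional rule: more load per instance than the target means more instances.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp so we never scale to zero or beyond the allowed maximum.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 instances at 90% CPU with a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # 4 instances at 30% CPU with a 60% target: scale in.
    print(desired_replicas(4, 30, 60))
```

Vertical scaling would instead keep the instance count and raise each instance's CPU or memory allocation.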

Infrastructure

provision: To create and configure infrastructure (servers, networks, databases).
IaC: Infrastructure as Code — defining infra in version-controlled files.
state: An IaC tool’s record of the infrastructure it manages, which it compares against your code to plan changes.
drift: When real infrastructure no longer matches what the code declares.
idempotent: An operation you can run repeatedly with the same end result.
immutable infrastructure: Replacing servers instead of changing them in place.
secret: A sensitive value (password, token) stored and injected securely.
environment: A named deployment target — dev, staging, production.
load balancer: A component that distributes traffic across instances.
reverse proxy: A server that forwards client requests to backend services.
DNS: The system that maps domain names to IP addresses.
TLS certificate: The credential that enables encrypted HTTPS connections.
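
The terms state, drift and idempotent fit together, and a toy example may help. The sketch below treats infrastructure as a plain dictionary of settings; this is an assumption for illustration, not how Terraform or any other IaC tool stores state. `detect_drift` and `apply` are hypothetical helper names.

```python
from typing import Dict, Optional, Tuple

Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def detect_drift(declared: Dict[str, str], actual: Dict[str, str]) -> Drift:
    """Return {key: (declared_value, actual_value)} for every mismatch."""
    drift: Drift = {}
    for key in sorted(set(declared) | set(actual)):
        want, have = declared.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


def apply(declared: Dict[str, str], actual: Dict[str, str]) -> int:
    """Converge `actual` to `declared` in place and return the number of changes.

    Idempotent: a second run finds no drift and changes nothing.
    """
    changes = 0
    for key, (want, _have) in detect_drift(declared, actual).items():
        if want is None:
            del actual[key]     # exists in reality but is not declared
        else:
            actual[key] = want  # missing or different: create or correct it
        changes += 1
    return changes


if __name__ == "__main__":
    declared = {"instance_type": "small", "port": "443"}
    actual = {"instance_type": "large", "debug": "on"}
    print(detect_drift(declared, actual))
    print(apply(declared, actual), "changes on first run")
    print(apply(declared, actual), "changes on second run")
```

The second run reports zero changes, which is what makes re-running the operation safe.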

Reliability & on-call

SLO: Service Level Objective — the reliability target you commit to internally.
SLA: Service Level Agreement — the reliability promise made to customers.
SLI: Service Level Indicator — the actual metric you measure (e.g. % of fast requests).
error budget: The allowed amount of unreliability before you must stop shipping risky changes.
incident: An unplanned disruption to a service that needs a response.
severity (SEV): How serious an incident is — SEV1 is critical, SEV3 is minor.
on-call: Being responsible for responding to alerts during a shift.
alert: An automated notification that something is wrong.
pager / paging: Being notified urgently, often outside hours, to handle an incident.
runbook: A step-by-step guide for handling a known operational task or incident.
postmortem: A blameless write-up after an incident explaining cause and fixes.
MTTR: Mean Time To Recovery — average time to restore service after failure.
observability: The ability to understand a system’s internal state from its outputs: logs, metrics and traces.
mitigation: A quick action that reduces impact before the root cause is fixed.
root cause: The underlying reason an incident happened, not just the symptom.
toil: Repetitive manual work that should be automated away.
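
Error budget and MTTR are both simple arithmetic, so a worked sketch may make them easier to discuss. It assumes a time-based availability SLO over a 30-day window, which is only one common way to define an error budget; the function names and sample numbers are illustrative.

```python
from statistics import mean
from typing import Sequence


def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - slo_percent) / 100.0


def remaining_budget_minutes(
    slo_percent: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Budget left after subtracting downtime already spent (negative if overspent)."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


def mttr_minutes(recovery_times: Sequence[float]) -> float:
    """Mean Time To Recovery: the average of per-incident recovery durations."""
    return mean(recovery_times)


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(99.9), 1))
    # After 30 minutes of downtime, about 13.2 minutes of budget remain.
    print(round(remaining_budget_minutes(99.9, 30), 1))
    # Three incidents recovered in 12, 18 and 24 minutes.
    print(mttr_minutes([12, 18, 24]))
```

When the remaining budget approaches zero, that is the moment for the phrase about holding risky deploys.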

Key phrases DevOps engineers use at work

We’re seeing elevated error rates in production — opening a SEV2 now.
Update: we’ve mitigated by rolling back to the previous release; investigating root cause.
Current status: customer-facing impact is contained; we’ll keep monitoring for the next 30 minutes.
I’m paging the database team — this is beyond what on-call can resolve alone.
We’ve burned through most of our error budget this month, so let’s hold risky deploys.
The canary has been healthy at 10% of traffic for an hour, so we’re promoting to 100%.
Heads up: the deploy to staging is blocked by a failing smoke test. No action needed yet.
There’s config drift on the prod cluster — Terraform plan shows three unmanaged changes.
Let’s gate this behind a feature flag so we can roll it back instantly if needed.
The pipeline is red — the build step is failing on a missing dependency.
I’ll write up the postmortem; it’s blameless, so let’s focus on the systemic fix.
The pods are getting OOM-killed — we need to bump the memory limit.
Autoscaling kicked in during the spike and held latency within the SLO.
This alert is noisy and not actionable — let’s tune the threshold to cut the toil.
Escalating to the platform team; I’ve added the dashboard link and the relevant logs.
Confirmed recovery — MTTR was about 18 minutes. I’ll send the incident summary.
We should make this idempotent so re-running the deploy script is always safe.
The certificate expires Friday — I’ve scheduled the renewal and added an alert.
Yesterday: migrated the CI to the new runners. Today: writing the rollback runbook.
Quick question before I proceed: do we want blue-green or a rolling deploy for this service?

How to use this cheatsheet

Incident communication is where DevOps English matters most — clear, calm, factual updates build trust. Memorise the status-update and escalation phrases first; they follow a predictable structure (what’s happening, impact, what you’re doing, next update). Skim the terms to fill gaps, then practise in the linked exercises so the words are ready when the pager goes off.
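
The four-part structure described above (what's happening, impact, what you're doing, next update) can be turned into a small template so every update has the same shape. This is a minimal sketch; the `format_status_update` helper and its field labels are hypothetical, not a standard.

```python
def format_status_update(
    what: str, impact: str, action: str, next_update_minutes: int
) -> str:
    """Build a four-line incident update in a fixed, predictable order."""
    lines = [
        f"What's happening: {what}",
        f"Impact: {impact}",
        f"What we're doing: {action}",
        f"Next update: in {next_update_minutes} minutes",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        format_status_update(
            what="Elevated error rates in production (SEV2).",
            impact="Some customer requests are failing.",
            action="Rolling back to the previous release.",
            next_update_minutes=30,
        )
    )
```

Filling in the same four fields each time keeps updates short and makes it obvious when one of them is missing.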
