DevOps English Vocabulary: CI/CD, Infrastructure-as-Code, and Incident Management

Master the English vocabulary DevOps engineers use every day — pipeline terminology, infrastructure-as-code language, on-call phrases, and incident management communication.

DevOps has its own dialect. When you join a team that operates pipelines, manages infrastructure through code, and responds to incidents at 2 a.m., you need to understand the words your colleagues use — and use them correctly yourself. Misusing terms like “rollback” vs “roll forward,” or confusing “deploy” with “release,” can cause expensive misunderstandings. This guide covers the vocabulary that DevOps engineers, SREs, and platform engineers use in meetings, runbooks, and Slack channels every day.


CI/CD Pipeline Terminology

Continuous Integration and Continuous Delivery are the backbone of modern software delivery. These terms appear in job descriptions, architecture discussions, and code review comments.

Pipeline — an automated sequence of steps that builds, tests, and deploys code. “The pipeline failed at the integration test stage — someone pushed a breaking change.”

Stage / Step — a discrete phase within a pipeline. Common stages include build, test, lint, security-scan, deploy-staging, deploy-production. “The deploy-prod stage requires manual approval before it runs.”

Trigger — the event that starts a pipeline run. Triggers include push events, pull request creation, merged PRs, scheduled cron jobs, or manual dispatches. “We added a trigger so the pipeline runs automatically on every push to the main branch.”

Artifact — a file or bundle produced by the build stage and passed to later stages. Docker images, compiled binaries, and test reports are all artifacts. “The pipeline stores build artifacts for 30 days so we can redeploy any previous version.”

Green build / Red build — informal terms for a passing (green) or failing (red) pipeline run. “We have a rule: never merge when the build is red.” “The build went green after I fixed the flaky test.”
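The terms above fit together in a few lines of Python. This is a toy model under stated assumptions, not how any real CI system works internally: the `Stage`, `PipelineResult`, and `run_pipeline` names are hypothetical, and the lambdas stand in for real build, test, and deploy commands. It shows stages running in order, the run stopping at the first failing stage, and the result reported as green or red.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    name: str
    run: Callable[[], bool]  # returns True when the stage passes


@dataclass
class PipelineResult:
    status: str                  # "green" or "red"
    failed_stage: Optional[str]  # name of the first failing stage, if any
    executed: List[str]          # names of the stages that actually ran


def run_pipeline(stages: List[Stage]) -> PipelineResult:
    """Run stages in order and stop at the first failure."""
    executed: List[str] = []
    for stage in stages:
        executed.append(stage.name)
        if not stage.run():
            return PipelineResult("red", stage.name, executed)
    return PipelineResult("green", None, executed)


# Hypothetical stages: the lambdas simulate pass/fail outcomes.
result = run_pipeline([
    Stage("build", lambda: True),
    Stage("test", lambda: False),          # simulate a breaking change
    Stage("deploy-staging", lambda: True),
])
print(result.status, result.failed_stage, result.executed)
# red test ['build', 'test']
```

Because the test stage fails, deploy-staging never runs, which is the behaviour people rely on when they say "never merge when the build is red."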

Flaky test — a test that sometimes passes and sometimes fails without any code change. “We have three flaky tests in the suite that cause intermittent pipeline failures — we need to fix or quarantine them.”
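One way to make "passes and fails without any code change" concrete is to look for a test that has both outcomes on the same commit. The sketch below assumes you can export run history as (test name, commit, passed) records; the record shape and the `find_flaky_tests` name are illustrative, not part of any particular CI tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Each record is (test_name, commit_sha, passed).
Run = Tuple[str, str, bool]


def find_flaky_tests(runs: Iterable[Run]) -> Set[str]:
    """Return tests that both passed and failed on the same commit."""
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for test_name, commit, passed in runs:
        outcomes[(test_name, commit)].add(passed)
    # Two distinct outcomes for one (test, commit) pair means the code
    # did not change but the result did.
    return {test for (test, _), seen in outcomes.items() if len(seen) == 2}


# Made-up history for illustration.
history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),     # same commit, different result
    ("test_checkout", "abc123", False),
    ("test_checkout", "def456", True),   # different commit, so not flaky
]
print(find_flaky_tests(history))  # {'test_login'}
```

Note that test_checkout is not reported: it failed on one commit and passed on another, which is what a genuine fix looks like.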

Deployment frequency — how often code is deployed to production. It is one of the key DORA metrics. “Our deployment frequency is 15 times per day — we deploy every merged PR automatically.”

Lead time for changes — the time from committing code to having it running in production. “Reducing lead time from three days to four hours was our main CI/CD goal this quarter.”
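Both deployment frequency and lead time for changes can be computed from two timestamps per change. The following is a minimal sketch with made-up data; it assumes each change is deployed on its own, and the function names are illustrative. The median is used for lead time here as a design choice, since one very slow change would otherwise distort an average.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Tuple

Change = Tuple[datetime, datetime]  # (commit time, production deploy time)


def deployment_frequency(changes: List[Change], days: int) -> float:
    """Average number of production deployments per day over the window."""
    return len(changes) / days


def median_lead_time(changes: List[Change]) -> timedelta:
    """Median time from commit to running in production."""
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
    ]
    return timedelta(seconds=median(seconds))


# Made-up sample data covering a single day.
day = datetime(2024, 1, 15)
changes = [
    (day.replace(hour=9), day.replace(hour=11)),   # 2 hours
    (day.replace(hour=10), day.replace(hour=16)),  # 6 hours
    (day.replace(hour=13), day.replace(hour=17)),  # 4 hours
]
print(deployment_frequency(changes, days=1))  # 3.0
print(median_lead_time(changes))              # 4:00:00
```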


Infrastructure-as-Code Language

Infrastructure-as-Code (IaC) means defining servers, networks, and services in code rather than clicking through a UI. The vocabulary is technical, and each term has a precise meaning.

Provision — to create and configure infrastructure resources. “We provision the database cluster with Terraform whenever a new environment is created.” Contrast with deprovision: to tear down resources.

State — in tools like Terraform, the stored record of what infrastructure currently exists, used to calculate what changes need to be made. “The Terraform state file drifted from reality because someone made a manual change in the AWS console.”

Drift — when the actual state of infrastructure diverges from what is defined in code. “We run drift detection nightly to catch any manual changes that bypass our IaC workflow.”

Idempotent — describes an operation that produces the same result no matter how many times it runs. IaC tools are designed to be idempotent. “Ansible playbooks should be idempotent — running them twice should not change anything on the second run.”
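The difference between an idempotent and a non-idempotent operation is easy to see in code. In this sketch, `ensure_line` and `append_line` are hypothetical helpers operating on an in-memory list that stands in for a config file. The idempotent version describes a desired end state ("this line is present"), while the other describes an action ("add this line").

```python
from typing import List


def append_line(lines: List[str], line: str) -> None:
    """Not idempotent: every call adds another copy."""
    lines.append(line)


def ensure_line(lines: List[str], line: str) -> bool:
    """Idempotent: add the line only if it is missing.

    Returns True when a change was made, False when nothing had to change.
    """
    if line in lines:
        return False
    lines.append(line)
    return True


config = ["PermitRootLogin no"]
first = ensure_line(config, "PasswordAuthentication no")
second = ensure_line(config, "PasswordAuthentication no")
print(first, second, len(config))  # True False 2

duplicated = ["PermitRootLogin no"]
append_line(duplicated, "PasswordAuthentication no")
append_line(duplicated, "PasswordAuthentication no")
print(len(duplicated))  # 3
```

The second call to `ensure_line` reports that nothing changed, which is the behaviour the example sentence describes for a second playbook run.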

Module — a reusable, self-contained unit of IaC configuration. “We have a shared VPC module that every team imports to ensure consistent network configuration.”

Plan / Apply — Terraform’s two-step workflow: plan shows what changes will be made; apply executes them. “Always review the plan output carefully before you apply — the plan shows you exactly what will be created, modified, or destroyed.”
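The two-step idea can be sketched in plain Python. This is a simplified model, not Terraform's actual algorithm or API: resources are plain dictionaries, and the `plan` and `apply` functions are illustrative names. `plan` only computes a list of actions, and nothing changes until `apply` is called.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]  # resource name -> its configuration
Action = Tuple[str, str]     # ("create" | "update" | "destroy", resource name)


def plan(desired: Resources, state: Resources) -> List[Action]:
    """Compare the code (desired) with the recorded state. Changes nothing."""
    actions: List[Action] = []
    for name in sorted(desired.keys() | state.keys()):
        if name not in state:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("destroy", name))
        elif desired[name] != state[name]:
            actions.append(("update", name))
    return actions


def apply(actions: List[Action], desired: Resources, state: Resources) -> Resources:
    """Execute the planned actions and return the new state."""
    new_state = dict(state)
    for action, name in actions:
        if action == "destroy":
            del new_state[name]
        else:  # "create" and "update" both end with the desired config
            new_state[name] = desired[name]
    return new_state


# Hypothetical resources for illustration.
desired = {"vpc": {"cidr": "10.0.0.0/16"}, "db": {"size": "large"}}
state = {"db": {"size": "small"}, "old-cache": {"nodes": 1}}

changes = plan(desired, state)
print(changes)  # [('update', 'db'), ('destroy', 'old-cache'), ('create', 'vpc')]

state = apply(changes, desired, state)
print(plan(desired, state))  # []
```

The empty second plan is worth noticing: once the state matches the code, there is nothing left to do.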

Immutable infrastructure — the practice of replacing servers rather than modifying them in place. “We follow immutable infrastructure principles: instead of patching the running instance, we bake a new AMI and replace it.”


Incident Management Phrases

Incidents are a normal part of running production systems. The language used during and after incidents is formal and precise, and it follows specific conventions.

Incident — an unplanned event that degrades or disrupts a service. Incidents are classified by severity, commonly SEV1 (critical), SEV2 (major), and SEV3 (minor), although the exact levels vary between organisations. “We have a SEV1 incident — the payment service is returning 500 errors for all users.”

On-call — the rotation of engineers responsible for responding to alerts, including outside business hours. “I’m on-call this week, so my phone is on loud.” “Who is on-call for the platform team right now?”
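The question "who is on-call right now?" is usually answered by a scheduling tool, but the underlying logic of a rotation is simple. The sketch below assumes a weekly rotation with a fixed order; the engineer names, the start date, and the `on_call_engineer` function are all made up for illustration.

```python
from datetime import date
from typing import List


def on_call_engineer(rotation: List[str], rotation_start: date, today: date) -> str:
    """Return who is on-call today, assuming each shift lasts one week."""
    weeks_elapsed = (today - rotation_start).days // 7
    return rotation[weeks_elapsed % len(rotation)]


rotation = ["Aiko", "Ben", "Chidi"]
start = date(2024, 1, 1)  # a Monday; the first shift begins here

print(on_call_engineer(rotation, start, date(2024, 1, 3)))   # Aiko
print(on_call_engineer(rotation, start, date(2024, 1, 10)))  # Ben
print(on_call_engineer(rotation, start, date(2024, 1, 22)))  # Aiko
```

After three weeks the rotation wraps around, so the first engineer is on-call again.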

Page / alert — a notification sent to the on-call engineer when a monitoring threshold is breached. “I got paged at 3 a.m. because latency exceeded the SLO threshold.”

Triage — the process of assessing the scope and severity of an incident and prioritising the response. “Once the alert fired, the team triaged the issue in five minutes and escalated to SEV1.”

Escalate — to involve more people or senior staff when an incident cannot be resolved quickly. “If you haven’t found the root cause within 30 minutes, escalate to the engineering lead.”

Mitigation — an action taken to reduce the impact of an incident, even if the root cause has not been fixed. “As a mitigation, we increased the connection pool size — the error rate dropped immediately.”

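Rollback — reverting to a previous working version of the software or configuration. “We rolled back the deployment at 11:47 and the service recovered within two minutes.” Contrast with roll forward: fixing the problem by deploying a new, corrected version instead of reverting.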

Postmortem / Post-incident review — a document written after an incident that explains what happened, why it happened, what the impact was, and what actions will prevent recurrence. Blameless postmortems focus on systems, not individuals. “The postmortem identified three contributing factors: a missing circuit breaker, inadequate alerting, and insufficient staging load testing.”

Root cause — the underlying reason an incident occurred, not just the immediate symptom. “The root cause was a memory leak introduced in the release two days before the incident.”


Observability and Monitoring Vocabulary

Observability — the ability to understand the internal state of a system from its external outputs (logs, metrics, traces). “We invested in observability tooling this quarter — we can now trace any request from the API gateway to the database.”

SLO (Service Level Objective) — an internal target for how reliable a service should be. “Our SLO for the checkout API is 99.9% availability and p99 latency under 300 ms.”
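An SLO like the one in the example sentence can be checked mechanically. The sketch below assumes availability is measured as the fraction of successful requests and that p99 is computed with the nearest-rank method; both are common choices but not the only ones, and the function names and sample numbers are illustrative.

```python
from typing import List


def availability(successful: int, total: int) -> float:
    """Fraction of requests that succeeded."""
    return successful / total


def p99(latencies_ms: List[float]) -> float:
    """99th percentile latency using the nearest-rank method."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer math
    return ordered[rank - 1]


def meets_slo(
    successful: int,
    total: int,
    latencies_ms: List[float],
    availability_target: float = 0.999,  # 99.9% availability
    p99_target_ms: float = 300.0,        # p99 latency under 300 ms
) -> bool:
    return (
        availability(successful, total) >= availability_target
        and p99(latencies_ms) < p99_target_ms
    )


# Made-up sample: 98 fast requests, one slower, one very slow outlier.
latencies = [120.0] * 98 + [250.0, 900.0]

print(p99(latencies))                     # 250.0
print(meets_slo(9995, 10000, latencies))  # True
print(meets_slo(9980, 10000, latencies))  # False
```

The single 900 ms outlier does not break the latency target, because p99 ignores the slowest 1% of requests. That is why teams quote percentiles rather than maximums.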

SLA (Service Level Agreement) — a formal, contractual commitment to customers about service reliability. “Our SLA guarantees 99.5% uptime; breaches trigger customer credits.”

Error budget — the amount of downtime or errors a service is allowed to have while still meeting its SLO. “We’ve burned 80% of our error budget this month — no risky deployments until next month.”
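The arithmetic behind an error budget is short enough to write out. This sketch assumes a time-based availability SLO measured over a 30-day window; the function names and the downtime figure are illustrative. A 99.9% SLO leaves 0.1% of the window as budget, which works out to about 43.2 minutes in 30 days.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    return (1 - slo) * window_days * 24 * 60


def budget_burned(downtime_minutes: float, slo: float, window_days: int = 30) -> float:
    """Fraction of the error budget already used."""
    return downtime_minutes / error_budget_minutes(slo, window_days)


budget = error_budget_minutes(0.999)
print(f"{budget:.1f} minutes")               # 43.2 minutes
print(f"{budget_burned(34.56, 0.999):.0%}")  # 80%
```

With 34.56 minutes of downtime already recorded, 80% of the budget is gone, which is the situation described in the example sentence.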

MTTR (Mean Time to Recovery) — the average time from an incident starting to the service being restored. “Improving our runbooks reduced MTTR from 45 minutes to 12 minutes.”
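MTTR is a plain average over incident durations. The following sketch uses made-up incident timestamps and an illustrative function name; it assumes you record when each incident started and when service was restored.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Incident = Tuple[datetime, datetime]  # (started, restored)


def mean_time_to_recovery(incidents: List[Incident]) -> timedelta:
    """Average time from incident start to service restoration."""
    durations = [restored - started for started, restored in incidents]
    return sum(durations, timedelta()) / len(durations)


# Made-up incidents lasting 10, 12, and 14 minutes.
incidents = [
    (datetime(2024, 3, 1, 3, 0), datetime(2024, 3, 1, 3, 10)),
    (datetime(2024, 3, 9, 14, 30), datetime(2024, 3, 9, 14, 42)),
    (datetime(2024, 3, 20, 23, 50), datetime(2024, 3, 21, 0, 4)),
]
print(mean_time_to_recovery(incidents))  # 0:12:00
```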


Practical Exercises

Test your understanding of DevOps vocabulary with these activities:

  1. Pipeline review: Read the YAML configuration of a GitHub Actions or GitLab CI workflow. Identify every stage, trigger, and artifact. Write a one-paragraph summary in English describing what the pipeline does.

  2. Incident simulation: Read a public postmortem (Google, Cloudflare, and Stripe publish them). Identify the triage steps, mitigation actions, and root cause. Summarise the timeline in your own words.

  3. IaC vocabulary mapping: Open a Terraform or Ansible repository on GitHub. Find examples of modules, state references, and idempotent operations. Annotate each with the correct vocabulary term.

  4. Write a mini runbook: Choose a simple operational task (e.g., restarting a service, scaling up a deployment) and write the steps in English as if writing a runbook for a new team member. Use at least five vocabulary terms from this article.

Practice using these terms in context by visiting the DevOps & Cloud vocabulary exercises on Coders Lingo.