How to Discuss a CI/CD Pipeline Failure in English
Learn the English phrasing for explaining a CI/CD pipeline failure, from distinguishing failure types to communicating impact to the team.
“The pipeline is broken” tells a team almost nothing actionable: it could be a real test failure, flaky infrastructure, a bad config change, or a downstream dependency outage, and each needs a different response. This guide covers the phrasing you need to say specifically what failed and why.
Key Vocabulary
Build failure — the pipeline failed during the compilation or packaging stage, before any tests ran, typically indicating a syntax error, a missing dependency, or a broken import. “This is a build failure, not a test failure — the code doesn’t even compile because of a missing import, so none of the actual tests got a chance to run yet.”
Test failure — the pipeline built successfully but one or more tests failed, indicating either a real regression in the code or a flaky/broken test, which need to be distinguished before deciding how to respond. “This is a genuine test failure, not flakiness — it fails consistently, and the assertion is checking behavior that the PR actually changed. This needs a real fix before merging.”
Infrastructure failure — the pipeline failed for reasons unrelated to the code being tested, such as a runner running out of disk space, a network timeout reaching an external service, or the CI provider having an outage. “That’s an infrastructure failure, not our code — the runner ran out of disk space mid-build. Retriggering the pipeline should fix it; there’s nothing in our PR to investigate.”
Pipeline stage — a discrete phase in a CI/CD pipeline (lint, build, test, deploy) that must typically pass before the next stage runs, useful for precisely locating where in the process a failure occurred. “It failed at the deploy stage, not the test stage — everything passed up through tests, but the deploy step couldn’t authenticate against the target environment.”
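To make the vocabulary above concrete, here is a minimal Python sketch of how a failure might be labeled before you write the status update. Everything in it is an assumption for illustration: the stage names, the log markers in `INFRA_MARKERS`, and the helper names `classify_failure` and `status_line` are hypothetical and do not come from any particular CI provider.

```python
from enum import Enum


class FailureType(Enum):
    BUILD = "build failure"
    TEST = "test failure"
    INFRASTRUCTURE = "infrastructure failure"
    OTHER = "failure"


# Hypothetical log fragments that usually point at the environment, not the code.
INFRA_MARKERS = ("no space left on device", "timed out", "connection reset")


def classify_failure(stage: str, log_excerpt: str) -> FailureType:
    """Guess the failure type from the failing stage and a log excerpt."""
    text = log_excerpt.lower()
    # Check for infrastructure symptoms first: they can appear at any stage.
    if any(marker in text for marker in INFRA_MARKERS):
        return FailureType.INFRASTRUCTURE
    if stage == "build":
        return FailureType.BUILD
    if stage == "test":
        return FailureType.TEST
    return FailureType.OTHER


def status_line(stage: str, log_excerpt: str) -> str:
    """Build a one-sentence status update naming the stage and the failure type."""
    kind = classify_failure(stage, log_excerpt).value
    article = "an" if kind[0] in "aeiou" else "a"
    return f"Failed at the {stage} stage: this looks like {article} {kind}."


print(status_line("build", "write error: No space left on device"))
print(status_line("test", "AssertionError: expected 9.99"))
```

A real triage still needs a human to read the log. The point of the sketch is only that the stage and the failure type are two separate facts, and a good status update states both.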
Common Phrases
- “Is this a build failure, a test failure, or an infrastructure issue?”
- “Which pipeline stage is this actually failing at?”
- “Is this a real regression, or is this test known to be flaky?”
- “Do we need code changes here, or is a retrigger enough?”
- “Is this failure specific to this PR, or is it happening on main too?”
Example Sentences
Triaging a failure in a PR comment: “This is failing at the build stage, not the test stage — looks like a missing dependency in the lockfile after the merge from main. Should be a quick fix, not an actual logic problem in the new code.”
Escalating an infrastructure issue to the team: “Heads up — the last three pipeline runs across multiple unrelated PRs have all failed at the same infrastructure step with a timeout reaching the package registry. This looks like an outage on their end, not anything in our code. I’ll post an update once it’s resolved.”
Explaining a real regression clearly: “This is a genuine test failure caused by the change in this PR — the test for the discount calculation is failing because the new rounding logic changes the expected output by a cent in edge cases. Needs an actual fix, not a retrigger.”
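The one-cent difference in the last example can be reproduced with a short Python sketch. The example sentence does not say which rounding rules are involved, so the two modes compared here (half-up as the "old" behavior, half-even as the "new" one), the price, the discount rate, and the helper name `discount_amount` are all assumptions chosen for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")


def discount_amount(price: Decimal, rate: Decimal, rounding: str) -> Decimal:
    """Compute the discount and round it to whole cents with the given mode."""
    return (price * rate).quantize(CENT, rounding=rounding)


# Edge case: the raw discount is exactly half a cent (2.6650).
price, rate = Decimal("26.65"), Decimal("0.10")
old = discount_amount(price, rate, ROUND_HALF_UP)    # assumed old behavior
new = discount_amount(price, rate, ROUND_HALF_EVEN)  # assumed new behavior

print(old, new, old - new)  # prints: 2.67 2.66 0.01
```

Because the two modes only disagree when the value sits exactly on a half cent, most prices pass and only edge cases fail, which is why it helps to name the edge case explicitly when you describe the regression.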
Professional Tips
- Classify the failure type immediately in any status update — build failure, test failure, or infrastructure failure — since each implies a completely different next step and audience.
- Name the specific pipeline stage where a failure occurred rather than saying “the pipeline failed” — it saves a teammate from re-running the whole thing just to find out where it actually broke.
- Distinguish a genuine regression from flakiness explicitly before deciding whether to merge — treating a real test failure as flaky (and retriggering until it passes) is how regressions slip through.
- When an infrastructure failure affects multiple unrelated PRs, escalate it as a shared issue rather than letting each author debug it independently — it saves the whole team redundant investigation time.
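Two of the tips above lend themselves to a rough check before you write the message. The Python sketch below is one possible way to express them, not an established tool: the function names, the PR identifiers, the stage names, and the `min_prs` threshold are hypothetical, and it assumes you already have the run results in hand.

```python
from collections import defaultdict


def looks_flaky(outcomes: list[bool]) -> bool:
    """Mixed pass/fail results for the same test on the same commit suggest flakiness.

    A test that fails on every run is more likely a genuine regression.
    """
    return True in outcomes and False in outcomes


def shared_failing_stages(failed_runs: list[tuple[str, str]], min_prs: int = 2) -> list[str]:
    """Return stages that failed in at least `min_prs` distinct PRs.

    `failed_runs` holds (pr_id, stage) pairs for failed runs only.
    """
    prs_by_stage = defaultdict(set)
    for pr_id, stage in failed_runs:
        prs_by_stage[stage].add(pr_id)
    return sorted(stage for stage, prs in prs_by_stage.items() if len(prs) >= min_prs)


# Three runs of one test on one commit: True means the run passed.
print(looks_flaky([False, True, False]))   # True: inconsistent, possibly flaky
print(looks_flaky([False, False, False]))  # False: consistent failure

# Failed runs across unrelated PRs (hypothetical identifiers).
runs = [("PR-1", "install"), ("PR-2", "install"), ("PR-3", "test")]
print(shared_failing_stages(runs))  # ['install']
```

A stage that shows up in the second result is a candidate for a single team-wide "heads up" message rather than a separate comment on each PR.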
Practice Exercise
- Write a sentence distinguishing a build failure from a test failure.
- Explain how you’d communicate an infrastructure-caused pipeline failure to your team.
- Describe the difference between a genuine test regression and a flaky test failure, and why the distinction matters before merging.