How to Discuss Flaky Tests in English

A practical English guide for talking about flaky tests — how to describe intermittent failures, propose fixes, and push back on 'just re-run it' culture.

Flaky tests — tests that pass and fail inconsistently without any code change — are a common source of frustration on engineering teams, and discussing them clearly in English matters more than it might seem. Vague descriptions like “it’s just flaky” often mask real bugs and make it harder to justify the time needed to fix them. This guide gives you the vocabulary to describe, investigate, and push back on flaky test problems in a professional way.

Key Vocabulary

Flaky test — a test that produces different results (pass or fail) across runs without any change to the code or test itself. “The checkout test is flaky — it fails about one run in ten, seemingly at random.”

Intermittent failure — a failure that occurs occasionally rather than consistently, making it harder to reproduce and diagnose. “We’ve seen an intermittent failure in the payment integration test, but we haven’t pinned down the trigger yet.”

Race condition — a bug where the outcome depends on the timing or order of concurrent operations, a common cause of flakiness. “The flakiness turned out to be a race condition between the test setup finishing and the background job starting.”

Test isolation — the principle that each test should run independently, without depending on shared state left over from another test. “We lost test isolation when two tests started writing to the same database row — running them in parallel caused failures.”

Quarantine (a test) — temporarily excluding a known-flaky test from the required CI suite while it’s investigated, so it doesn’t block unrelated merges. “We quarantined the flaky test and filed a ticket, so it stops blocking every unrelated pull request while we investigate.”

Root cause — the underlying reason a test fails intermittently, as opposed to the surface symptom. “The root cause wasn’t the test itself — it was a shared test database that hadn’t fully reset between runs.”

Retry logic (in CI) — automatically re-running a failed test once or twice before marking it as failed, often used as a stopgap for flakiness. “We added retry logic as a short-term measure, but it’s a workaround, not a fix — the underlying race condition is still there.”

Signal (test signal) — the degree to which a test’s pass/fail result reliably indicates a real problem, as opposed to noise. “Every flaky test we leave unfixed erodes trust in the whole suite — people stop trusting the signal and start ignoring failures.”

Describing a Flaky Test to Your Team

  • “This test fails intermittently — roughly 5% of runs — and I haven’t found a clear trigger yet.”
  • “It looks timing-related. The test occasionally runs before the async job it depends on has completed.”
  • “I don’t think this is a real regression. The same test failed on main before this change, unrelated to what we’re merging.”

Proposing a Fix or Investigation Plan

  • “I’d like to quarantine this test for now and open a ticket, rather than let it keep blocking merges for the whole team.”
  • “Before we retry-and-move-on, can we spend an hour trying to reproduce it locally? I suspect it’s a race condition in the setup code.”
  • “I’ll add better logging around the failure point so the next occurrence gives us more to work with.”

Pushing Back on “Just Re-run It” Culture

It’s common for teams to develop a habit of re-running flaky tests without investigating. Here’s how to raise a concern about that constructively:

  • “I know re-running usually gets it green, but we’ve re-run this same test six times this week. Can we prioritise looking into it?”
  • “If we keep re-running instead of fixing, we’re training everyone to distrust red CI, which is worse for us long-term.”
  • “This is the third flaky test we’ve quarantined this month — I think it’s worth a short retro on why our test isolation keeps breaking.”

Professional Tips

  1. Quantify flakiness whenever possible. “Fails 1 in 10 runs” is more actionable than “sometimes fails.”
  2. Separate the symptom from the root cause in your description. “The test times out” is a symptom; “a race condition between setup and the async job” is closer to a root cause.
  3. Frame quarantining as a temporary, tracked decision. Without a ticket and an owner, quarantined tests are often forgotten permanently.

Practice Exercise

  1. Write a Slack message (3-4 sentences) describing a flaky test to your team, including how often it fails and what you suspect the cause is.
  2. Write a short proposal (4-5 sentences) to quarantine a flaky test, including why and what the follow-up plan is.
  3. Write a polite but firm message pushing back on a teammate who suggests just re-running a failing test without investigation.