5 exercises on the language of test health — brittle tests and test smells, assertions, determinism, quarantining flaky tests, and false positives vs false negatives.
Key patterns
brittle / fragile test → breaks too easily
test smell → symptom of a deeper problem
assert / assertion → the test check
deterministic test → same result every run
false positive vs false negative → false alarm vs missed bug
1 / 5
Reviewing a fragile test suite, a senior engineer warns: "These tests are ___ — the smallest change to unrelated code makes dozens of them fail." Which adjective fits?
Brittle tests — tests that break too easily:
A brittle test fails in response to small, unrelated changes — it is too tightly coupled to implementation details (exact HTML, call order, internal structure) rather than behaviour. Brittle suites make refactoring painful because everything goes red for the wrong reasons.
"brittle test" / "the suite is brittle"
"tests are tightly coupled to the implementation"
"test against behaviour, not implementation"
"fragile test" — near-synonym
Contrast the distractors: a deterministic test always gives the same result (a good property); idempotent means running it repeatedly has the same effect as running it once; green just means passing. The word for a test that snaps at the slightest touch is brittle.
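To make "test against behaviour, not implementation" concrete, here is a minimal Python sketch. The function `render_greeting` and both check helpers are hypothetical names invented for this illustration; the idea is only to show why a check pinned to exact markup breaks on a harmless refactor while a behaviour-focused check survives it.

```python
import re


def render_greeting(name: str) -> str:
    """Hypothetical function under test: returns an HTML greeting."""
    return f'<p class="greeting">Hello, <strong>{name}</strong>!</p>'


def brittle_check(html: str) -> bool:
    # Brittle: pins the exact markup, so renaming a CSS class or swapping
    # <strong> for <b> fails the check although behaviour is unchanged.
    return html == '<p class="greeting">Hello, <strong>Ada</strong>!</p>'


def behaviour_check(html: str) -> bool:
    # Behaviour-focused: strips the tags and checks only the visible text.
    visible_text = re.sub(r"<[^>]+>", "", html)
    return visible_text == "Hello, Ada!"


original = render_greeting("Ada")

# A harmless refactor of the markup: same visible text, different tags.
refactored = '<p class="hello">Hello, <b>Ada</b>!</p>'
```

Both checks accept `original`, but only `behaviour_check` still accepts `refactored`. The brittle check goes red for the wrong reason.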
2 / 5
In a code review a reviewer comments: "This is a classic test ___ — the test has five unrelated assertions and three levels of nested setup. It signals a deeper design problem."
Test smell — a warning sign in test code:
A test smell (by analogy with "code smell") is a surface symptom in a test that suggests a deeper problem — not necessarily a bug, but a sign the test or the design under test needs attention.
"a test smell" / "this smells"
"a code smell" — the general term for the production-code equivalent
Common test smells:
Assertion roulette — many assertions, unclear which failed
Mystery guest — the test depends on hidden external data
Eager test — one test verifying too many things
Fragile / brittle test — setup tightly coupled to internals
Why not the others? A test case is a single scenario, a test runner executes tests, and a fixture is the fixed setup or starting data a test runs against. A "smell" specifically means a symptom worth investigating.
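As an illustration of two smells from the list above, the following sketch uses Python's standard `unittest` module. The function `parse_user` and the test class names are assumptions made up for this example. The first class shows an eager test with assertion roulette; the second shows one possible way to split it so that each failure names the behaviour that broke.

```python
import unittest


def parse_user(raw: str) -> dict:
    """Hypothetical function under test: parses 'name,age,email'."""
    name, age, email = raw.split(",")
    return {"name": name.strip(), "age": int(age), "email": email.strip().lower()}


class SmellyUserTest(unittest.TestCase):
    def test_everything(self):
        # Eager test + assertion roulette: several unrelated checks in one
        # test, no messages. The first failure stops the test and hides the rest.
        user = parse_user(" Ada , 36, ADA@Example.com ")
        self.assertEqual(user["name"], "Ada")
        self.assertEqual(user["age"], 36)
        self.assertEqual(user["email"], "ada@example.com")


class FocusedUserTest(unittest.TestCase):
    RAW = " Ada , 36, ADA@Example.com "

    def test_name_is_trimmed(self):
        self.assertEqual(parse_user(self.RAW)["name"], "Ada")

    def test_age_is_an_integer(self):
        self.assertEqual(parse_user(self.RAW)["age"], 36)

    def test_email_is_lowercased(self):
        self.assertEqual(parse_user(self.RAW)["email"], "ada@example.com")
```

Both classes pass against the current `parse_user`. The difference shows up when something breaks: the focused tests report which property failed by name.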
3 / 5
Explaining a check inside a test, a developer says: "After calling the function, we ___ that the result equals 42 — if the ___ fails, the test fails." Which pair fits?
Assert / assertion — the core test-check vocabulary:
An assertion is the statement in a test that checks an expected condition; to assert is the verb for making that check. If the asserted condition is false, the test fails.
verb: "assert that x equals y" / "assert the response is 200"
noun: "the assertion failed" / "assertion error"
"assert on the output" / "make an assertion"
library spellings: assertEqual (Python unittest; assertEquals in JUnit), expect(x).toBe(y) (Jest), assert x == y (pytest)
Note: some frameworks use expect as the spelling (expect(...)), and that is fine — but the general noun for the check itself is an assertion, and "if the assertion fails" is the natural phrasing. "Assume", "declare" and "promise" mean other things in programming.
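The vocabulary above maps onto code roughly as follows. This is a small Python sketch in which `answer` and `failing_assertion_message` are hypothetical names; it shows the bare `assert` statement, the `unittest` spelling, and what "the assertion fails" means at runtime.

```python
import unittest


def answer() -> int:
    """Hypothetical function under test."""
    return 42


def test_answer_plain_assert():
    # pytest-style spelling: a bare assert statement.
    assert answer() == 42


class AnswerTest(unittest.TestCase):
    def test_answer(self):
        # unittest spelling of the same assertion.
        self.assertEqual(answer(), 42)


def failing_assertion_message() -> str:
    # When the asserted condition is false, Python raises AssertionError;
    # a test runner reports that as a failed test.
    try:
        assert answer() == 41, "expected 41"
    except AssertionError as exc:
        return str(exc)
    return "no failure"
```

Here the deliberately wrong assertion raises `AssertionError`, which is why the usual phrasing is "the assertion failed".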
4 / 5
A test passes locally but fails in CI for no clear reason. A teammate suggests a triage step: "Let’s ___ that test — move it out of the gating suite so it stops blocking everyone while we investigate."
Quarantine a test — isolating flakiness without losing it:
To quarantine a test is to move a known-flaky test out of the gating (merge-blocking) suite so it stops causing false failures, while keeping it around to investigate — rather than deleting it and losing the coverage.
"quarantine the flaky test" / "move it to the quarantine suite"
"mark it as flaky" / "tag it @flaky"
"skip it temporarily" / "mark it pending"
"stabilise / fix it, then bring it back"
Why not delete? Deleting removes a real test; quarantining preserves it for triage. The distractors are wrong moves here: you don’t deploy a test, assert isn’t triage, and refactoring doesn’t stop it blocking the pipeline. The precise term for sidelining a flaky test is quarantine.
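One way to picture quarantining is a tag plus a filter. The sketch below is a simplified, framework-free illustration: the `quarantine` decorator, `split_suite`, and the two test functions are all hypothetical, and real projects would more likely rely on their test runner's own tagging mechanism.

```python
from typing import Callable, Iterable, List, Tuple


def quarantine(reason: str) -> Callable:
    """Hypothetical decorator: tag a test as quarantined without deleting it."""
    def mark(test_func: Callable) -> Callable:
        test_func.quarantined = True
        test_func.quarantine_reason = reason
        return test_func
    return mark


def test_checkout_total():
    assert 2 + 2 == 4


@quarantine("fails intermittently in CI; under investigation")
def test_payment_webhook():
    assert True  # placeholder body standing in for the flaky test


def split_suite(tests: Iterable[Callable]) -> Tuple[List[Callable], List[Callable]]:
    """Return (gating, quarantined) so only gating tests block merges."""
    gating, quarantined = [], []
    for test in tests:
        if getattr(test, "quarantined", False):
            quarantined.append(test)
        else:
            gating.append(test)
    return gating, quarantined


gating, quarantined = split_suite([test_checkout_total, test_payment_webhook])
```

The quarantined test still exists and can be run separately while it is investigated. In pytest, a comparable effect is commonly achieved with a custom marker and the `-m` option to deselect marked tests from the gating run.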
5 / 5
A QA lead distinguishes two failure modes: "A ___ ___ is when a test reports a bug that isn’t real; a ___ ___ is worse — the test passes but a real bug slipped through." Which pair fills the blanks?
False positive vs false negative — the two ways a test misleads:
In testing, these terms describe misleading results (convention: a "positive" = the test signals a problem / fails):
False positive — the test fails but nothing is actually wrong (a false alarm). Flaky tests and brittle assertions cause these; they erode trust ("the boy who cried wolf").
False negative — the test passes but a real bug exists — the defect slips through undetected. More dangerous, because it gives false confidence.
Collocations: "this is a false positive", "we’re getting false negatives — the test isn’t actually checking anything", "reduce false positives".
Why not the others? "True positive/negative" are the correct outcomes; red/green describe build state; smoke/regression are test types. The pair for misleading results is false positive / false negative. A deterministic test, which gives the same result every run, removes flakiness as a source of false positives, but determinism alone does not rule out false negatives: a test can pass on every run and still miss a real bug.
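The two failure modes can be shown side by side in a short Python sketch. All function names here are hypothetical. `add_prices` is correct and `apply_discount` contains a deliberate bug, so that one check can raise a false alarm against correct code and another can pass despite a real defect.

```python
import math


def add_prices(a: float, b: float) -> float:
    """Correct hypothetical function."""
    return a + b


def apply_discount(price: float, percent: float) -> float:
    """Buggy hypothetical function: adds the discount instead of subtracting."""
    return price + price * percent / 100


def check_passes(check) -> bool:
    """Run a check; True means it passed, False means AssertionError."""
    try:
        check()
    except AssertionError:
        return False
    return True


def false_positive_check():
    # The code is correct, but exact float comparison fails: a false alarm.
    assert add_prices(0.1, 0.2) == 0.3


def tolerant_check():
    # Comparing with a tolerance removes the false alarm.
    assert math.isclose(add_prices(0.1, 0.2), 0.3)


def false_negative_check():
    # Passes despite the bug: it checks only the type, not the value.
    assert isinstance(apply_discount(100.0, 10.0), float)


def strict_discount_check():
    # Checks the actual value, so it catches the bug.
    assert apply_discount(100.0, 10.0) == 90.0
```

`false_positive_check` goes red although nothing is wrong, and `false_negative_check` stays green although `apply_discount` is broken. The tolerant and strict variants show one possible fix for each.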