5 exercises — practice structuring strong English answers to QA interview questions: test taxonomy, flaky tests, test planning techniques, CD quality gates, and late-cycle defect handling.
How to structure QA interview answers
Test taxonomy questions: define each level precisely → name the key differentiator (e.g., real vs mocked dependencies) → mention the test pyramid
Flaky test questions: quarantine → diagnose root cause (timing, state, environment, data) → fix, don't just retry
Test planning questions: name design techniques (BVA, equivalence partitioning, decision tables) → include risk assessment and exit criteria
CD quality questions: shift-left → automated quality gates → feature flags → observability as post-deploy feedback
Defect handling questions: quantify business impact → present options with trade-offs → follow with a post-mortem improvement
1 / 5
The interviewer asks: "What is the difference between unit tests, integration tests, and end-to-end tests?" Which answer demonstrates the clearest and most precise test taxonomy?
Option B is the strongest: it gives precise definitions with the key detail for each level (isolation + mocked dependencies for unit; real dependencies for integration; production-like environment + named tools for E2E), explains the trade-off between speed/cost and coverage, and names the test pyramid. The critical distinction for integration tests is that they use real dependencies (a real database, real API calls to adjacent services); this is what separates them from unit tests, where everything is mocked.
Key vocabulary:
Unit test: a single function or class, dependencies mocked, runs in milliseconds.
Integration test: multiple components plus real dependencies; may need a test database or containerised services.
End-to-end (E2E) test: a full user flow through a real browser or API client; the slowest level.
Test pyramid: many unit tests, fewer integration tests, fewest E2E tests. Contrast with the test trophy (Kent C. Dodds) or the test ice cream cone (an anti-pattern with too many E2E tests).
Named tools matter: Cypress and Playwright (E2E browser tests); Jest, Vitest, pytest (unit/integration); Supertest, Testing Library (integration).
Options C and D are reasonable but less precise about what "real dependencies" means in integration testing.
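To make the unit/integration distinction concrete, here is a minimal sketch in Vitest (one of the tools named above); the `RatesApi` interface, `PriceService` class, and the mocked rate are hypothetical examples invented for illustration, not part of the exercise.

```typescript
// Hypothetical service under test: converts a price using an exchange-rate dependency.
import { describe, expect, it, vi } from "vitest";

interface RatesApi {
  getRate(from: string, to: string): Promise<number>;
}

class PriceService {
  constructor(private rates: RatesApi) {}

  async convert(amount: number, from: string, to: string): Promise<number> {
    const rate = await this.rates.getRate(from, to);
    return Math.round(amount * rate * 100) / 100;
  }
}

// Unit test: the dependency is mocked, so the test is isolated and runs in milliseconds.
describe("PriceService (unit)", () => {
  it("converts an amount using the mocked rate", async () => {
    const fakeRatesApi: RatesApi = { getRate: vi.fn().mockResolvedValue(1.1) };
    const service = new PriceService(fakeRatesApi);

    await expect(service.convert(100, "EUR", "USD")).resolves.toBe(110);
  });
});

// An integration test of the same service would instead wire in a real HTTP client pointed
// at a containerised test instance of the rates service: slower, but it exercises the real
// dependency, which is exactly what separates the two levels.
```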
2 / 5
The interviewer asks: "How do you handle flaky tests — tests that sometimes pass and sometimes fail for no obvious reason?" Which answer best demonstrates systematic QA thinking?
Option B is the strongest: it explains why flakiness is a problem (trust in CI), gives a precise three-step process (quarantine → diagnose → fix), names the four most common root causes with specifics, makes the critical point that retries mask the problem rather than fix it, and treats the flaky-test rate as a tracked metric, which shows QA maturity.
The four root causes of test flakiness are important to name in an interview:
1. Timing/async issues: fixed sleeps, missing awaits, race conditions. The fix: explicit waits, event-driven assertions, retry on a condition (not a blanket retry).
2. Test ordering dependencies: one test leaves shared state that breaks the next. The fix: proper setup/teardown in each test, test isolation.
3. Environment differences: different locale, timezone, available ports, resource constraints. The fix: containerise tests, use deterministic seeds.
4. Non-deterministic data: random IDs in assertions, floating-point comparisons, time-dependent logic. The fix: mock randomness, use time-freezing libraries, use approximate matchers.
The "quarantine" step is a production QA practice: move the flaky test to a separate suite that still runs but doesn't block the pipeline, so the team can fix it without ignoring it.
Options C and D are solid but less comprehensive on root causes, and they don't frame flakiness as a metric/quality-debt problem.
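As a minimal sketch of two of the fixes above, assuming Vitest and a hypothetical `isBusinessHours` function and polling helper (names invented for illustration): freezing the clock removes time-dependent flakiness, and an event-driven wait replaces a fixed sleep.

```typescript
import { afterEach, describe, expect, it, vi } from "vitest";

// Hypothetical time-dependent logic: flaky if the suite runs near midnight or in another timezone.
function isBusinessHours(now: Date = new Date()): boolean {
  const hour = now.getHours();
  return hour >= 9 && hour < 17;
}

describe("isBusinessHours", () => {
  afterEach(() => vi.useRealTimers());

  it("is deterministic when the clock is frozen", () => {
    vi.useFakeTimers();
    vi.setSystemTime(new Date("2024-03-01T10:00:00")); // freeze time instead of relying on the real clock
    expect(isBusinessHours()).toBe(true);
  });
});

// Event-driven wait instead of a fixed sleep: poll for a condition with a timeout,
// so the test waits only as long as needed and fails with a clear message.
async function waitForCondition(check: () => boolean, timeoutMs = 2000, intervalMs = 25): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (check()) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Condition not met within timeout");
}
```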
3 / 5
The interviewer asks: "Walk me through how you write a test plan for a new feature." Which answer best demonstrates structured QA planning?
Option B is the strongest: it names specific test design techniques (boundary value analysis, equivalence partitioning, decision tables), separates test levels, explains entry/exit criteria, includes risk assessment as a prioritisation mechanism, and covers the operational side (effort estimation, dependencies).
Key test plan vocabulary and techniques:
Test design techniques:
Boundary value analysis: test the values at the boundaries of input ranges (if the valid range is 1–100, test 0, 1, 100, 101).
Equivalence partitioning: divide inputs into groups where all values in a group behave identically; test one value from each group.
Decision table: map all combinations of conditions to expected outputs (useful for complex business logic).
Test levels in a plan: unit, integration, system/E2E, regression, smoke/sanity, UAT.
Entry criteria: what must be true before testing begins (e.g., feature deployed to the test environment, test data loaded).
Exit criteria: what signals that testing is complete (e.g., all P1 cases passed, defect rate below threshold).
Risk-based testing: allocate more test coverage to high-risk, high-impact areas.
Options C and D are reasonable answers that would pass review, but B demonstrates test engineering depth that goes beyond calling yourself a "QA engineer who writes test cases".
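A minimal sketch of boundary value analysis written as a parameterised Vitest test, assuming a hypothetical `isValidQuantity` validator with a valid range of 1–100 (the function and the range are illustrative, not from the exercise):

```typescript
import { describe, expect, it } from "vitest";

// Hypothetical validator: accepts integer quantities in the inclusive range 1–100.
function isValidQuantity(quantity: number): boolean {
  return Number.isInteger(quantity) && quantity >= 1 && quantity <= 100;
}

describe("isValidQuantity: boundary value analysis", () => {
  // One case per boundary: just below, on, and just above each edge of the valid range.
  it.each([
    { quantity: 0, expected: false },   // below the lower boundary
    { quantity: 1, expected: true },    // lower boundary
    { quantity: 2, expected: true },    // just above the lower boundary
    { quantity: 99, expected: true },   // just below the upper boundary
    { quantity: 100, expected: true },  // upper boundary
    { quantity: 101, expected: false }, // above the upper boundary
  ])("returns $expected for quantity $quantity", ({ quantity, expected }) => {
    expect(isValidQuantity(quantity)).toBe(expected);
  });
});
```

Each row doubles as a documented test case, which is how boundary values from a written test plan typically end up in the automated suite.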
4 / 5
The interviewer asks: "How do you ensure quality in a continuous deployment pipeline where code ships multiple times per day?" Which answer best demonstrates modern QA practices?
Option B is the strongest: it frames the core principle (quality gates replace manual QA), gives the complete multi-layer approach, and crucially includes feature flags and canary deployments, concepts that go beyond the testing pyramid and show awareness of modern CD practices where "testing in production" (safely) is a real technique.
Key modern CD quality vocabulary:
Shift left: move quality activities earlier in the development lifecycle (devs test, QA pairs during development).
Quality gate: an automated check that must pass before the pipeline proceeds.
Feature flag / feature toggle: deploy code but control its activation, enabling testing in production on a subset of users.
Canary deployment: route a small percentage of traffic (e.g., 5%) to the new version before full rollout; this limits the blast radius of a bad deploy.
Blue/green deployment: run two identical production environments and switch traffic between them instantly.
Contract testing: verify that a consumer and a provider agree on an API contract without full integration tests (the Pact library).
Observability as a quality feedback loop: real user impact (error rates, latency, conversion) is the ultimate quality signal.
Option C is excellent and includes contract testing, making it a strong answer as well. Options A and D are weaker because they omit feature flags and canary deployments, which are the distinguishing practices in true CD environments.
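As a minimal sketch of the feature flag / canary idea, assuming a hypothetical percentage-based rollout with an invented `isEnabledForUser` helper (not tied to any specific feature-flag product):

```typescript
import { createHash } from "node:crypto";

// Hypothetical flag configuration: each flag is rolled out to a fixed percentage of users.
const rolloutPercentages: Record<string, number> = {
  "new-checkout-flow": 5, // canary-style: only 5% of users take the new code path
};

// Deterministically bucket a user into 0–99 so the same user always gets the same decision.
function bucketForUser(flagName: string, userId: string): number {
  const digest = createHash("sha256").update(`${flagName}:${userId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

export function isEnabledForUser(flagName: string, userId: string): boolean {
  const percentage = rolloutPercentages[flagName] ?? 0;
  return bucketForUser(flagName, userId) < percentage;
}

// Usage: the code ships to everyone, but only the flagged subset executes the new path.
// if (isEnabledForUser("new-checkout-flow", currentUser.id)) { renderNewCheckout(); } else { renderOldCheckout(); }
```

The same deterministic bucketing is what lets a canary percentage be raised gradually while observability (error rates, latency, conversion) confirms the new version is safe.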
5 / 5
The interviewer asks: "Describe a time when you found a critical bug late in the release cycle. How did you handle it?" Which answer best demonstrates professional QA communication and decision-making?
Option B is the strongest STAR answer: the Situation (a payment feature, 3 days from go-live) is specific and high-stakes; the Task (document, escalate, recommend) is clear; the Action (an impact estimate backed by data, a three-option presentation, escalation to the right stakeholders) demonstrates professional seniority; and the Result (option 2 chosen, validated in the next cycle, a post-mortem improvement) is complete.
The critical differentiator is presenting options with trade-offs rather than just escalating the problem. Senior QA engineers don't just say "there's a bug"; they say "here are your options, here is the business impact of each, what do you decide?"
Key vocabulary for late-cycle defect handling:
P0/P1 severity: classification of bug severity.
Business impact estimate: quantifying the effect in business terms (lost transactions, affected users).
Feature flag / scope reduction: shipping a subset of the feature without the broken part.
Escalation path: engineering lead, product owner, release manager.
Post-mortem / retrospective action item: the quality improvement that prevents recurrence.
Note that Option C is also a strong answer (it involves security testing, which shows QA depth), but its less quantified impact and fewer presented options make it weaker for demonstrating senior-level QA judgment.
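To make the business impact estimate concrete, here is a hypothetical worked example (the figures are illustrative, not from the exercise): if the bug affects roughly 5% of checkout attempts, the product handles about 10,000 transactions per day, and the average order value is €40, the exposure is roughly 10,000 × 0.05 × €40 ≈ €20,000 per day, plus the number of affected users for support load. A number like that turns "there's a bug" into a trade-off the release manager can actually decide on.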