Practise the language of A/B testing ML models in production: control vs treatment, traffic splitting, guardrail metrics, and statistical significance.
1 / 5
In a model A/B test, the existing model serving the baseline experience is called the ___ group.
The control group receives the current (champion) model so its behaviour is the baseline you compare the new model against.
2 / 5
The new model variant whose impact you want to measure is served to the ___ group.
The treatment (or variant) group is exposed to the new model so you can attribute differences in metrics to the model change.
3 / 5
Sending 5% of traffic to a new model and 95% to the current one is an example of ___.
Traffic splitting routes a defined fraction of requests to each variant so you can run a controlled comparison at limited risk.
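As a rough illustration of the split described above, here is a minimal sketch of deterministic, hash-based assignment. The function name `assign_variant`, the `salt` value, and the use of a user ID as the routing key are assumptions made for this example, not something the quiz specifies.

```python
import hashlib


def assign_variant(user_id: str, treatment_fraction: float = 0.05,
                   salt: str = "model-ab-test") -> str:
    """Map a user to 'control' or 'treatment' (hypothetical helper).

    Hashing the salted user ID gives a stable pseudo-random bucket in [0, 1),
    so the same user always sees the same model for a given salt.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as a uniform bucket in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_fraction else "control"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(20000)]
    share = sum(assign_variant(u) == "treatment" for u in users) / len(users)
    print(f"treatment share: {share:.3f}")
```

Hashing rather than drawing a random number per request keeps each user in one group for the whole experiment, which avoids mixing both models into a single user's experience.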
4 / 5
A metric that must NOT degrade (e.g. latency, error rate) even if the primary metric improves is called a ___ metric.
Guardrail metrics protect against harmful side effects; a treatment that boosts clicks but breaks latency guardrails should not ship.
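One way to picture a guardrail check is sketched below. The `Guardrail` class, the `breached_guardrails` helper, the thresholds, and the metric values are all hypothetical; the sketch also assumes every guardrail is a "lower is better" metric with a positive control value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Guardrail:
    name: str
    control: float                 # metric value under the control model
    treatment: float               # metric value under the treatment model
    max_relative_increase: float   # e.g. 0.05 tolerates up to 5% worse


def breached_guardrails(guardrails: List[Guardrail]) -> List[str]:
    """Return the names of guardrails the treatment degrades beyond tolerance.

    Assumes lower is better (latency, error rate) and control > 0.
    """
    breached = []
    for g in guardrails:
        relative_change = (g.treatment - g.control) / g.control
        if relative_change > g.max_relative_increase:
            breached.append(g.name)
    return breached


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    checks = [
        Guardrail("p95_latency_ms", control=100.0, treatment=120.0,
                  max_relative_increase=0.05),
        Guardrail("error_rate", control=0.010, treatment=0.0101,
                  max_relative_increase=0.05),
    ]
    print(breached_guardrails(checks))
```

In this sketch a non-empty result would block the launch regardless of how the primary metric moved.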
5 / 5
Declaring the treatment the winner when the observed difference could plausibly be due to chance means the result has not reached ___.
Statistical significance means the observed difference would be unlikely to arise from random noise alone if the two models truly performed the same; establish it before declaring a winner.
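For a conversion-style metric, one common way to check significance is a two-proportion z-test. The sketch below is a minimal version under stated assumptions: independent users, large samples, and a two-sided test. The function name and the example counts are hypothetical, and a real experiment may call for a different test.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_c: int, n_c: int,
                          conv_t: int, n_t: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for treatment vs control conversion rates."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Illustrative counts only, not real experiment data.
    z, p = two_proportion_z_test(conv_c=1000, n_c=10000, conv_t=1200, n_t=10000)
    print(f"z = {z:.2f}, p = {p:.6f}")
```

A p-value below a threshold chosen before the test (0.05 is a common convention) is usually what "statistically significant" refers to.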