How to Write a Canary Release Report in English
Learn the English structure and vocabulary for writing a canary release report: metrics tracked, rollout decision, and the promote-or-rollback call.
A canary release only earns its purpose if someone writes down what was actually observed and why the rollout was promoted or rolled back — without that report, the next canary starts from zero instead of building on what was learned. This guide covers the English for writing one clearly.
Key Vocabulary
Canary percentage — the proportion of traffic or instances running the new version during the canary phase, which should be stated explicitly rather than assumed. “The canary was running at 5% of production traffic for two hours before we made the promote decision.”
Baseline comparison — the practice of comparing the canary’s metrics against the existing stable version’s metrics over the same time window, rather than judging the canary in isolation. “Error rate on the canary was 0.3%, compared to a 0.25% baseline on the stable version over the same window — close enough that we attributed the difference to normal variance.”
Guardrail metric — a specific metric with a predefined threshold that, if crossed, automatically or manually triggers a rollback, agreed upon before the canary starts rather than decided in the moment. “Our guardrail metric was p99 latency staying under 400ms — it stayed at 280ms throughout the canary window, well within the threshold.”
Promote decision — the explicit decision to roll the canary version out to full production traffic, made based on the guardrail metrics and baseline comparison rather than just elapsed time. “We made the promote decision after four hours, once guardrail metrics held steady and the sample size was large enough to be confident it wasn’t a fluke.”
Rollback decision — the explicit decision to revert the canary and return all traffic to the previous stable version, along with the specific metric or observation that triggered it. “The rollback decision was triggered by a guardrail breach — checkout error rate crossed 1% on the canary within twenty minutes, well above the 0.5% threshold we’d set.”
Sample size / statistical confidence — the amount of traffic or time the canary needs to run before its metrics are considered a reliable signal rather than noise, which should be stated in the report. “Two hours at 5% traffic gave us roughly 40,000 requests on the canary — enough sample size to trust the error rate comparison wasn’t just noise.”
Common Phrases
- “What canary percentage and duration are we reporting on here?”
- “How does this compare to the baseline over the same time window, not just in isolation?”
- “Which guardrail metric, if any, was breached — or did the canary stay within all agreed thresholds?”
- “Was the promote decision based on sufficient sample size, or could this still be noise?”
- “What specifically triggered the rollback, and at what point in the canary window?”
Example Sentences
Writing a promote report: “Canary ran at 10% traffic for three hours. Guardrail metrics (error rate, p99 latency) stayed within threshold throughout, and baseline comparison showed no meaningful degradation. Promote decision made at 14:00 with full confidence given a sample size of roughly 80,000 requests.”
Writing a rollback report: “Canary ran at 5% traffic starting 09:00. At 09:35, error rate on the canary crossed our 1% guardrail threshold (baseline was 0.2% over the same window). Rolled back immediately; root cause under investigation, tracked in INC-482.”
Requesting more rigor in a canary review: “Before we promote, I’d like to see the baseline comparison explicitly, not just the canary’s raw numbers — 0.4% error rate sounds fine until you realize the baseline was 0.1% for the same period.”
Professional Tips
- Always report the canary percentage and duration explicitly — “we ran a canary” without those numbers gives a reader no way to judge how much signal to trust.
- Use a baseline comparison, not an absolute number in isolation — a metric that looks fine on its own can still represent a real regression relative to what the stable version was doing.
- Agree on guardrail metrics and thresholds before the canary starts, and reference them by name in the report — deciding thresholds after seeing the data invites motivated reasoning.
- State the promote or rollback decision explicitly with its trigger or justification — a report that just lists metrics without a stated conclusion leaves the next reader to redo the analysis.
Practice Exercise
- Write a one-sentence summary of a canary running at a specific percentage and duration.
- Write a guardrail metric with a specific threshold for a hypothetical release.
- Write a rollback report stating what triggered the decision and when.