Why this hub exists

Reliability engineering vocabulary recurs across several categories on purpose. SLI, SLO, reliability, and canary all show up in more than one place because the underlying discipline is interconnected: you cannot set a target without measuring first, and you cannot verify resilience without also controlling how changes roll out. Coders Lingo covers this with five focused categories rather than one unfocused mega-category, and this page is the map that explains how the pieces connect and what makes each one distinct.

The SRE English landscape, in plain terms

Broadly, the five categories below form a pipeline. Observability Engineering Language is the foundation — instrumenting a system with traces, metrics, and logs so you can see what it is doing. SLO & Error Budget Engineering Language turns that data into targets — SLIs become SLOs, and error budgets decide how much risk a team can take. Chaos Engineering Language and Progressive Delivery Language are the two practices that protect those targets: chaos engineering deliberately breaks things to verify resilience in general, while progressive delivery limits the blast radius of any one specific change through canaries and feature flags. Service Mesh Operations Language is the networking layer that often implements the traffic-splitting behind progressive delivery and produces much of the trace data observability tooling consumes — tying the other four together at the infrastructure level.

These categories are not duplicates of each other, even where terms like SLI, SLO, reliability, and canary recur across them. Each card below states in one line exactly what makes that category distinct, so you can see how the vocabulary connects rather than assuming overlap means repetition.

The 5 SRE & reliability vocabulary categories

Frequently asked questions

Why are there so many separate SRE and reliability vocabulary categories on Coders Lingo?

Site reliability engineering covers several genuinely distinct activities: instrumenting a system to see what it is doing, setting targets for how reliable it must be, deliberately breaking it to verify resilience, rolling out changes safely, and managing the network layer that connects services. Coders Lingo splits this into five focused categories rather than one unfocused mega-category, and shares recurring terms like SLI, SLO, reliability, and canary across them because the vocabulary is genuinely used across the discipline. This hub explains how the pieces fit together.

Which SRE category should I start with?

Start with Observability Engineering Language: tracing, metrics, and structured logging are the data foundation that SLO & Error Budget Engineering Language, Chaos Engineering Language, and Service Mesh Operations Language all assume you understand. From there, move to SLO & Error Budget Engineering Language to learn how that data becomes reliability targets, then Chaos Engineering Language and Progressive Delivery Language for the practices that protect those targets.

What is the difference between "Chaos Engineering" and "Progressive Delivery"?

Chaos Engineering Language is about deliberately injecting failure into a system to verify it survives — GameDays, fault injection, and resilience reporting. Progressive Delivery Language is about safely rolling out a new change — canary releases, blue-green deployments, and feature flags. Chaos engineering tests whether a system is resilient to failure in general; progressive delivery limits the blast radius of a specific new change. Many teams practice both.
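To make "limits the blast radius of a specific new change" concrete, here is a minimal sketch of percentage-based canary assignment. It assumes users are identified by a string ID and are bucketed by hashing; the function name `in_canary` and the bucketing scheme are illustrative assumptions, not the behaviour of any particular feature-flag or deployment tool.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Decide whether a user receives the new version (hypothetical helper).

    Hashing the ID keeps each user in the same cohort on every request,
    and raising `percent` only ever adds users to the canary cohort.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


if __name__ == "__main__":
    # Widen the rollout step by step; a bad release only reaches `percent`% of users.
    for step in (1, 10, 50, 100):
        exposed = sum(in_canary(f"user-{i}", step) for i in range(1000))
        print(f"{step:>3}% rollout -> {exposed} of 1000 users on the new version")
```

A chaos experiment would sit on the other side of this comparison: instead of narrowing who sees a change, it injects a failure on purpose and checks that the system still meets its targets.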

Why do "SLO Engineering" and "Observability Engineering" share terms like SLI?

Because they are adjacent layers of the same discipline. Observability Engineering Language covers how you collect the underlying signals — traces, metrics, and logs, including the SLI as a measured indicator. SLO & Error Budget Engineering Language covers what you do with that indicator once collected: setting a target (the SLO), tracking an error budget, and deciding when to alert on burn rate. You cannot set a meaningful SLO without observability data feeding it, so the categories are sequential rather than duplicates.
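The relationship between these terms can be sketched as simple arithmetic. The example below assumes an availability-style SLI (good requests divided by total requests); the function names, the request counts, and the 99.9% target are illustrative assumptions, not values from any specific monitoring tool.

```python
def sli(good: int, total: int) -> float:
    """Measured indicator: fraction of requests that were good."""
    if total == 0:
        return 1.0  # assumption: no traffic is treated as meeting the target
    return good / total


def error_budget(slo: float) -> float:
    """Allowed failure fraction implied by the SLO target (e.g. 0.999 -> 0.001)."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return 1.0 - slo


def burn_rate(good: int, total: int, slo: float) -> float:
    """Observed error rate divided by the error rate the SLO allows."""
    observed_error_rate = 1.0 - sli(good, total)
    return observed_error_rate / error_budget(slo)


if __name__ == "__main__":
    slo_target = 0.999             # the SLO: a target set on top of the SLI
    good, total = 99_800, 100_000  # the observability data feeding the SLI
    print(f"SLI:          {sli(good, total):.4f}")
    print(f"Error budget: {error_budget(slo_target):.4f}")
    print(f"Burn rate:    {burn_rate(good, total, slo_target):.1f}")
```

With these sample numbers the burn rate is about 2, meaning errors are arriving at twice the rate the SLO allows, so the budget would run out in half the SLO window if that rate continued.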

How does "Service Mesh Operations" relate to the rest of this cluster?

A service mesh is infrastructure that often implements the traffic-splitting used in progressive delivery (canary routing) and produces much of the trace data consumed by observability tooling, in addition to its own mTLS and networking vocabulary. It is included here because engineers working across this cluster frequently need mesh vocabulary alongside SLOs, tracing, and chaos experiments, even though the mesh itself is a distinct networking layer.

How many total exercises are covered across the SRE and reliability vocabulary cluster?

The five categories in this hub cover 120 exercises in total, spanning instrumentation, target-setting, resilience testing, safe rollout strategy, and service mesh networking. Each category is self-contained, so you can start with whichever matches your current work.
