Kubernetes Operators Vocabulary: CRDs, Controllers, and Operator Patterns

Custom Resource Definitions, controllers, operator SDK, reconcile loop, and Kubernetes operator vocabulary for platform engineers.

If you work on platform engineering, site reliability, or DevOps at a company running Kubernetes, you have almost certainly heard someone say “we should write an operator for that.” But the vocabulary around Kubernetes operators is dense — CRDs, reconcile loops, finalizers, webhooks — and if you are not a native English speaker, the gap between understanding the technology and following a fast-paced architectural discussion can feel wide. This post covers the core terminology you need to participate confidently in those conversations.

Core Terms: The Building Blocks

Operator pattern — a method of extending Kubernetes by encoding the operational knowledge of a human expert (a “day-two” operator) into software. An operator watches your cluster for a specific type of resource and acts to keep things in the desired state, just as a human operator would.

“We keep doing the same manual steps every time we upgrade the database. Let’s encode that logic in an operator so the cluster can handle it automatically.”

“The operator pattern is really just ‘controllers plus domain knowledge’ — once you see it that way it clicks.”

Custom Resource Definition (CRD) — a Kubernetes API extension that lets you define your own resource types. A CRD tells Kubernetes “there is now a new kind of object called a PostgresCluster” (or whatever you choose). You create it once, and from that point the API server accepts objects of that kind.

“We need to register the CRD first before anyone can create a BackupPolicy object — the API server won’t know what to do with it otherwise.”

“The CRD is the schema; the custom resource is the actual instance. Don’t mix them up in the design doc.”

Custom resource (CR) — an instance of a CRD. Once you have defined a PostgresCluster CRD, each individual cluster you create — with its own name, namespace, and spec — is a custom resource.

“There are three custom resources deployed in production right now, each representing a separate Postgres instance managed by the operator.”
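To make the CRD/CR distinction concrete, here is a minimal Go sketch of what the types behind a PostgresCluster kind might look like. It uses only the standard library, not the real Kubernetes API packages; the API group example.com/v1alpha1, the spec fields, and the helper NewPostgresCluster are all assumptions made for illustration. The struct definitions play the role of the schema (the CRD), and the value built in main is one instance (a CR).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ObjectMeta is a trimmed-down stand-in for the Kubernetes metadata block.
type ObjectMeta struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// PostgresClusterSpec is the desired state a user declares (hypothetical fields).
type PostgresClusterSpec struct {
	Replicas int    `json:"replicas"`
	Version  string `json:"version"`
}

// PostgresCluster is one custom resource: an instance of the PostgresCluster kind.
type PostgresCluster struct {
	APIVersion string              `json:"apiVersion"`
	Kind       string              `json:"kind"`
	Metadata   ObjectMeta          `json:"metadata"`
	Spec       PostgresClusterSpec `json:"spec"`
}

// NewPostgresCluster builds a custom resource with the type information filled in.
func NewPostgresCluster(name, namespace string, replicas int, version string) PostgresCluster {
	return PostgresCluster{
		APIVersion: "example.com/v1alpha1", // hypothetical API group and version
		Kind:       "PostgresCluster",
		Metadata:   ObjectMeta{Name: name, Namespace: namespace},
		Spec:       PostgresClusterSpec{Replicas: replicas, Version: version},
	}
}

func main() {
	// One CR: a named, namespaced instance with its own spec.
	cr := NewPostgresCluster("orders-db", "prod", 3, "16")
	out, err := json.MarshalIndent(cr, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In a real project the struct definitions would typically be generated by the scaffolding tool and the CRD manifest derived from them, so the two rarely drift apart.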

Controller — a control loop that watches Kubernetes objects and works to reconcile the current state of the cluster with the desired state. Built-in resources such as Deployments are backed by controllers that run in the control plane (inside kube-controller-manager); operators add their own controllers, which usually run as ordinary pods in the cluster.

“The controller is running in a crash loop — it is failing to connect to the API server, so none of the custom resources are being reconciled.”

“We wrote a custom controller that watches CertificateRequest objects and automatically calls our internal CA.”

Reconcile loop — the core function inside a controller. The controller framework calls this function whenever a watched object changes (and again on periodic resyncs or after a failed attempt), passing it only the object’s namespace and name. The function’s job is to read the current state, compare it to the desired state, and take whatever action is necessary to close the gap.

“Your reconcile loop should be idempotent — if it runs ten times on the same object it should produce the same result every time.”

“We added a log line at the start of the reconcile loop so we can see exactly how often it is being triggered.”
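The idea of an idempotent reconcile function can be sketched in a few lines of plain Go. This is a toy model, not controller-runtime code: the Cluster type is an in-memory stand-in for the API server and the running workload, and the key "prod/orders-db" is a made-up example. What it is meant to show is that the function receives only a key, reads both states itself, and can safely be called again and again.

```go
package main

import "fmt"

// Cluster is an in-memory stand-in for the API server plus the real workload.
type Cluster struct {
	Desired map[string]int // object key -> replicas declared in the spec
	Running map[string]int // object key -> replicas actually running
}

// Reconcile receives only the object's key, reads both states itself, and
// closes the gap. It returns how many replicas it started or stopped.
func Reconcile(c *Cluster, key string) int {
	desired, ok := c.Desired[key]
	if !ok {
		// The object was deleted: make sure nothing is left running.
		changed := c.Running[key]
		delete(c.Running, key)
		return changed
	}
	diff := desired - c.Running[key]
	c.Running[key] = desired
	if diff < 0 {
		return -diff
	}
	return diff
}

func main() {
	c := &Cluster{
		Desired: map[string]int{"prod/orders-db": 5},
		Running: map[string]int{"prod/orders-db": 3},
	}
	// Calling Reconcile repeatedly is safe: only the first call has work to do.
	for i := 1; i <= 3; i++ {
		fmt.Printf("run %d: changed %d replicas\n", i, Reconcile(c, "prod/orders-db"))
	}
}
```

The first run changes two replicas and the later runs change none, which is the property the first quote above is asking for.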

Desired state vs observed state — “desired state” is what you have declared (the spec in your YAML); “observed state” is what is actually running in the cluster (the status). The entire Kubernetes model is built on continuously driving observed state towards desired state.

“The observed state shows three replicas but the desired state is five — the controller should be scaling up, let’s check why it is not.”

“Kubernetes is declarative precisely because you express desired state and let the controllers handle how to get there.”
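As a small illustration of the spec/status split, the sketch below compares a declared replica count with an observed one and names the action a controller would take. The Spec and Status types and the Drift helper are simplified, hypothetical names; real resources carry many more fields.

```go
package main

import "fmt"

// Spec holds the desired state, as declared in the YAML.
type Spec struct{ Replicas int }

// Status holds the observed state, as reported by the controller.
type Status struct{ ReadyReplicas int }

// Drift describes the gap between desired and observed state.
func Drift(spec Spec, status Status) string {
	switch d := spec.Replicas - status.ReadyReplicas; {
	case d > 0:
		return fmt.Sprintf("scale up by %d", d)
	case d < 0:
		return fmt.Sprintf("scale down by %d", -d)
	default:
		return "in sync"
	}
}

func main() {
	// Desired state is five replicas, observed state is three.
	fmt.Println(Drift(Spec{Replicas: 5}, Status{ReadyReplicas: 3}))
}
```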

Frameworks and Tooling

kubebuilder — an official Kubernetes SIG project that provides scaffolding, code generation, and best-practice patterns for building operators in Go. Running kubebuilder init and kubebuilder create api generates the boilerplate controller, CRD manifests, and test harness for you.

“Use kubebuilder if you are writing the operator in Go — it handles all the scaffolding and keeps your project aligned with upstream conventions.”

“The kubebuilder book is dense but it is the canonical reference; skim it before your first operator sprint.”

Operator SDK — a framework from Red Hat (part of the Operator Framework project) that supports writing operators in Go, Ansible, or Helm. It wraps kubebuilder for Go operators and adds tooling for packaging and distributing operators via OLM.

“The Operator SDK gives you a higher-level abstraction than raw kubebuilder — useful if your team is more comfortable with Ansible than Go.”

“We used the Operator SDK’s Helm operator mode to wrap our existing Helm chart so it behaves like a proper operator.”

controller-runtime — the Go library that underlies both kubebuilder and the Operator SDK. It provides the Manager, reconciler interfaces, event filters, and client caches that most Go operators depend on. The scaffolding generates most of the calls into it for you, but your reconciler still uses its client and request types directly, so it is worth knowing by name.

“That bug is in controller-runtime’s caching layer — it is not caching status subresource updates, so our reconciler is acting on stale data.”

Manager — the top-level object in controller-runtime that owns the cache, the API client, and the lifecycle of all controllers running in a single operator binary. You typically create one manager per operator and register controllers with it.

“The manager is responsible for leader election — only one pod will actively reconcile at a time, which avoids split-brain issues.”

“We register all three controllers with the same manager so they share a single informer cache and reduce API server load.”

Distribution and Runtime Behaviour

Operator Lifecycle Manager (OLM) — a cluster add-on from the Operator Framework project (it is not part of core Kubernetes) that manages the installation, upgrade, and lifecycle of operators themselves. It introduces concepts like ClusterServiceVersion, Subscription, and InstallPlan so operators can be installed and updated in a controlled, auditable way.

“OLM handles operator upgrades — instead of kubectl apply-ing a new deployment manually, you update the subscription and OLM coordinates the rollout.”

“If OLM is not installed in the cluster, you cannot use OperatorHub — you would need to install operators manually.”

operatorhub.io — the public catalogue of community and certified operators, maintained by Red Hat and the broader Kubernetes community. It is the equivalent of a package registry but for operators.

“Before we build one from scratch, let’s check operatorhub.io — there is probably already a maintained operator for Kafka.”

Level-based trigger vs edge-based trigger — two approaches to deciding when a controller should act. Edge-based means “act when something changes”; level-based means “act based on the current state regardless of how you got there.” Kubernetes strongly recommends level-based because it is more resilient to missed events.

“Don’t write an edge-based controller — if the operator restarts it will miss events and never recover. Always use level-based logic in your reconciler.”

“Level-based is why your reconcile function should re-read state from the API server rather than relying on the event payload.”
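The difference is easiest to see when an event gets lost. The toy simulation below (plain Go, with hypothetical EdgeController and LevelController types) replays the same change events against both styles and drops one of them, as would happen if the operator pod were restarting at that moment.

```go
package main

import "fmt"

// Event is a change notification: Delta is how many replicas were requested
// to be added (positive) or removed (negative).
type Event struct{ Delta int }

// EdgeController keeps its own running total and only moves when it sees an event.
type EdgeController struct{ Replicas int }

func (e *EdgeController) Handle(ev Event) { e.Replicas += ev.Delta }

// LevelController ignores the event payload and copies the current desired state.
type LevelController struct{ Replicas int }

func (l *LevelController) Handle(desired int) { l.Replicas = desired }

// Simulate replays events against both controllers; events for which
// delivered[i] is false are lost (for example during a restart).
func Simulate(events []Event, delivered []bool) (edge, level int) {
	desired := 0
	ec, lc := &EdgeController{}, &LevelController{}
	for i, ev := range events {
		desired += ev.Delta // the declared state changes whether or not anyone is listening
		if !delivered[i] {
			continue // the controller was down and missed this notification
		}
		ec.Handle(ev)
		lc.Handle(desired)
	}
	return ec.Replicas, lc.Replicas
}

func main() {
	events := []Event{{Delta: 3}, {Delta: 2}, {Delta: -1}}
	edge, level := Simulate(events, []bool{true, false, true})
	fmt.Printf("desired=4 edge=%d level=%d\n", edge, level)
}
```

After the missed second event, the edge-based controller ends at 2 replicas and never catches up, while the level-based one ends at the desired 4 as soon as it is triggered again. Periodic resyncs exist to cover the case where the very last event is the one that gets lost.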

Finalizer — a string placed in an object’s metadata.finalizers field that prevents Kubernetes from deleting the object until the named finalizer is removed. Controllers use finalizers to run clean-up logic (deleting cloud resources, revoking certificates) before the object disappears.

“We added a finalizer so that when someone deletes the DatabaseCluster CR, the operator has a chance to take a final backup before the cloud instance is terminated.”

“Never forget to remove your finalizer in the reconcile loop’s deletion path — otherwise your custom resources will get stuck in a Terminating state forever.”
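The usual shape of a finalizer-aware reconcile function can be sketched without any Kubernetes libraries. In the example below, the Object type models only the two metadata fields that matter, the Deleting flag stands in for a set deletion timestamp, and the finalizer name example.com/final-backup is made up. Treat it as an outline of the control flow rather than real operator code.

```go
package main

import "fmt"

const backupFinalizer = "example.com/final-backup" // hypothetical finalizer name

// Object models only the metadata fields that matter for deletion.
type Object struct {
	Finalizers []string
	Deleting   bool // stands in for a non-nil metadata.deletionTimestamp
}

func hasFinalizer(o *Object, name string) bool {
	for _, f := range o.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}

func removeFinalizer(o *Object, name string) {
	kept := o.Finalizers[:0]
	for _, f := range o.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	o.Finalizers = kept
}

// Reconcile adds the finalizer to live objects and, on deletion, runs the
// cleanup and removes the finalizer only if the cleanup succeeded.
func Reconcile(o *Object, cleanup func() error) error {
	if !o.Deleting {
		if !hasFinalizer(o, backupFinalizer) {
			o.Finalizers = append(o.Finalizers, backupFinalizer)
		}
		return nil
	}
	if !hasFinalizer(o, backupFinalizer) {
		return nil // nothing left for this controller to do
	}
	if err := cleanup(); err != nil {
		return err // keep the finalizer so the cleanup is retried
	}
	removeFinalizer(o, backupFinalizer)
	return nil
}

func main() {
	obj := &Object{}
	_ = Reconcile(obj, nil) // live object: finalizer is added, cleanup is not called
	obj.Deleting = true     // someone deleted the resource
	err := Reconcile(obj, func() error {
		fmt.Println("taking final backup")
		return nil
	})
	fmt.Println("error:", err, "finalizers left:", len(obj.Finalizers))
}
```

The important detail is the failure branch: if the cleanup returns an error, the finalizer stays in place, so the object remains in Terminating and the cleanup is attempted again on the next reconcile.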

Webhook (admission / mutating / validating) — HTTP callbacks that Kubernetes calls during the API request lifecycle. A mutating admission webhook can modify incoming objects (e.g., inject a sidecar). A validating admission webhook can reject objects that do not meet policy. Operators often ship webhooks alongside their controllers.

“The mutating webhook sets default values on the CR before it is stored, so the controller does not have to handle missing fields.”

“Our validating webhook rejects any BackupPolicy that does not specify a retention period — we enforce that at admission time rather than letting bad configs reach the controller.”
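The two quotes above describe a defaulting step followed by a validation step. A minimal sketch of that logic in plain Go might look like the following. The BackupPolicySpec fields, the default schedule, and the Admit helper are assumptions for illustration; a real webhook would receive and return AdmissionReview objects over HTTPS rather than call functions directly.

```go
package main

import (
	"errors"
	"fmt"
)

// BackupPolicySpec is a hypothetical spec with two fields.
type BackupPolicySpec struct {
	Schedule      string // cron expression
	RetentionDays int    // 0 means the user did not set it
}

// Default plays the role of the mutating webhook: it fills in missing
// values before the object is stored.
func Default(s *BackupPolicySpec) {
	if s.Schedule == "" {
		s.Schedule = "0 2 * * *" // assumed default: every day at 02:00
	}
}

// Validate plays the role of the validating webhook: it rejects objects
// that do not meet policy.
func Validate(s BackupPolicySpec) error {
	if s.RetentionDays <= 0 {
		return errors.New("spec.retentionDays must be set to a positive number")
	}
	return nil
}

// Admit runs the two steps in the order the API server uses: mutate, then validate.
func Admit(s BackupPolicySpec) (BackupPolicySpec, error) {
	Default(&s)
	if err := Validate(s); err != nil {
		return BackupPolicySpec{}, err
	}
	return s, nil
}

func main() {
	_, err := Admit(BackupPolicySpec{})
	fmt.Println("no retention:", err)

	ok, err := Admit(BackupPolicySpec{RetentionDays: 30})
	fmt.Println("with retention:", ok.Schedule, err)
}
```

Because defaulting runs first, the validation step (and later the controller) can assume that optional fields such as the schedule are always populated.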

How to Use These Terms in Conversation

Once you know the vocabulary, the challenge is using it naturally in team discussions, code reviews, and architecture meetings. A few patterns that come up regularly:

  • When proposing automation: “Rather than scripting this in a pipeline, we could model it as a CRD and write a controller — that way the desired state lives in the cluster and the reconcile loop handles drift.”
  • When reviewing operator code: “Is this controller level-based? I want to make sure it recovers correctly if the operator pod is restarted mid-reconciliation.”
  • When debugging a stuck resource: “Check kubectl get <resource> -o yaml — if there is a finalizer in the metadata and the deletion timestamp is set, the controller’s clean-up path is not completing.”
  • When discussing distribution: “Do we ship this through OLM or just raw manifests? If the customer clusters all have OLM installed, operatorhub.io packaging is worth the effort.”

Quick Reference

Term                          One-line definition
Operator pattern              Encoding human operational knowledge into a Kubernetes controller
CRD                           Schema that registers a new resource type in the Kubernetes API
Custom resource (CR)          An instance of a CRD: your actual workload object
Reconcile loop                The idempotent function a controller calls to drive observed → desired state
Finalizer                     A guard that blocks object deletion until clean-up is confirmed
Mutating webhook              Admission hook that can modify objects before they are persisted
Validating webhook            Admission hook that can reject non-compliant objects
OLM                           Operator Lifecycle Manager: installs and upgrades operators themselves
kubebuilder / Operator SDK    Scaffolding frameworks for writing operators (kubebuilder: Go; Operator SDK: Go, Ansible, or Helm)
Level-based trigger           Controller acts on current state, not just change events; resilient to restarts