English for KEDA Autoscaling

Learn the English vocabulary for KEDA (Kubernetes Event-Driven Autoscaling): scalers, trigger metrics, cooldown periods, and scale-to-zero.

Discussions of KEDA often get muddled with standard Kubernetes autoscaling vocabulary. KEDA’s main value is scaling on external event sources, such as queue depth or message lag, rather than only on CPU or memory, so being precise about which metric is actually driving a scale event matters for debugging.

Key Vocabulary

Scaler — a KEDA component that connects to a specific external system (a queue, a stream, a database) and reports a metric KEDA can scale on, such as queue length or consumer lag. “We’re using the Kafka scaler to scale workers based on consumer group lag, not the generic CPU metric, since CPU usage doesn’t correlate well with backlog size for this workload.”

ScaledObject — the custom resource that ties a scaler’s trigger to a target deployment, defining thresholds, min/max replicas, and polling interval. “The ScaledObject was capped at a max of 3 replicas, which is why scaling stalled even though the queue depth kept climbing.”
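To make the fields in that definition concrete, here is a minimal sketch of a ScaledObject built as a Python dictionary and printed as JSON. The resource names, broker address, topic, and consumer group are hypothetical placeholders, and the numbers are chosen only to mirror the example sentence (a max of 3 replicas); treat it as an illustration, not a recommended configuration.

```python
import json

# All names and addresses below are hypothetical placeholders.
scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "orders-worker-scaler"},
    "spec": {
        # The workload KEDA scales (a Deployment by default).
        "scaleTargetRef": {"name": "orders-worker"},
        "pollingInterval": 30,   # seconds between trigger checks
        "cooldownPeriod": 300,   # seconds to wait before scaling to zero
        "minReplicaCount": 0,    # 0 allows scale-to-zero
        "maxReplicaCount": 3,    # the cap from the example sentence
        "triggers": [
            {
                "type": "kafka",
                "metadata": {
                    "bootstrapServers": "kafka.example.svc:9092",
                    "consumerGroup": "orders-group",
                    "topic": "orders",
                    # Target lag per replica; metadata values are strings.
                    "lagThreshold": "50",
                },
            }
        ],
    },
}

manifest_json = json.dumps(scaled_object, indent=2)
print(manifest_json)
```

Reading the manifest aloud is good practice: “The ScaledObject targets orders-worker, polls every 30 seconds, and is capped at 3 replicas.”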

Trigger metric — the specific value a scaler reports and that KEDA compares against a threshold to decide whether to scale up or down, such as queue length or consumer lag. The threshold itself is set in the trigger’s metadata, for example lagThreshold in the Kafka scaler. “The trigger metric here is Redis queue length, so a spike in message volume, not request latency, is what causes this deployment to scale.”
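The comparison between a trigger metric and its threshold can be sketched as a small calculation. This is a simplified model, assuming the threshold is a per-replica target and ignoring the tolerance and stabilization logic a real autoscaler applies; the function name and the numbers are illustrative only.

```python
import math


def desired_replicas(metric_total, target_per_replica, min_replicas, max_replicas):
    """Simplified replica estimate: ceil(metric / target), clamped to min/max."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    raw = math.ceil(metric_total / target_per_replica)
    return max(min_replicas, min(max_replicas, raw))


# A lag of 420 messages with a target of 50 per replica asks for 9 replicas.
print(desired_replicas(420, 50, min_replicas=1, max_replicas=10))  # 9

# The same lag with max replicas capped at 3 stalls at 3.
print(desired_replicas(420, 50, min_replicas=1, max_replicas=3))   # 3
```

The second call is the situation described under ScaledObject: the trigger is asking for more replicas, but the cap decides the outcome.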

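Cooldown period — the amount of time KEDA waits after the last trigger reported active before scaling a deployment back down to zero (the cooldownPeriod field, 300 seconds by default), used to avoid rapid thrashing between zero and one replica. Scaling down between higher replica counts is handled by the HPA that KEDA manages, not by this setting. “We increased the cooldown period from 5 to 15 minutes because the deployment was scaling to zero and immediately back up every time there was a brief lull in traffic.”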
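The effect of the cooldown period on flapping can be illustrated with a toy simulation. This is a sketch under stated assumptions, not KEDA’s implementation: each poll simply reports whether the trigger is active, and the workload drops to zero once the time since the last active poll reaches the cooldown. The function name and the traffic pattern are hypothetical.

```python
def count_scale_to_zero_events(active_by_poll, polling_interval, cooldown_seconds):
    """Count how many times a workload would drop to zero replicas."""
    events = 0
    running = False
    last_active_at = None
    for i, active in enumerate(active_by_poll):
        now = i * polling_interval
        if active:
            # Any activity (re)starts the workload and resets the timer.
            running = True
            last_active_at = now
        elif running and now - last_active_at >= cooldown_seconds:
            # Idle for at least the cooldown: scale to zero.
            running = False
            events += 1
    return events


# Bursty traffic: one active poll followed by a 60-second lull, repeated.
polls = [True, False, False, True, False, False, True, False, False]

print(count_scale_to_zero_events(polls, polling_interval=30, cooldown_seconds=60))   # 3
print(count_scale_to_zero_events(polls, polling_interval=30, cooldown_seconds=300))  # 0
```

With a 60-second cooldown the workload drops to zero after every lull; with 300 seconds it stays up through all of them, which is the trade-off the example sentence describes.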

Scale-to-zero — KEDA’s ability to scale a deployment down to zero replicas when there’s no work to process, and scale back up from zero once a trigger condition is met, something the standard Horizontal Pod Autoscaler does not do by default. “This worker scales to zero overnight since the queue is empty, and KEDA brings it back up within seconds once a job arrives. That’s the main reason we chose KEDA over the built-in HPA.”

Common Phrases

  • “Which scaler is actually driving this — the queue length one, or is CPU still in the mix?”
  • “Is the ScaledObject’s max replica count the bottleneck, or is the trigger threshold itself too conservative?”
  • “What’s the cooldown period set to — is that why it’s flapping between zero and a few replicas?”
  • “Is this deployment configured to scale to zero, or does it always keep a minimum replica running?”
  • “Is the scaler polling the trigger metric frequently enough, or is there a lag between the queue spiking and the scale event firing?”

Example Sentences

Debugging a scaling delay: “Jobs were piling up because the ScaledObject’s polling interval was 30 seconds and max replicas was capped at 2 — by the time it scaled up, the backlog had already grown far past what 2 replicas could clear.”
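The arithmetic behind a sentence like this can be sketched in a few lines. The arrival rate and per-replica throughput below are hypothetical numbers, and the model assumes constant rates; it is only meant to show why a low replica cap lets a backlog keep growing.

```python
def backlog_after(seconds, arrival_per_s, per_replica_per_s, replicas, start_backlog=0.0):
    """Backlog size after a period of constant arrival and processing rates."""
    net_growth_per_s = arrival_per_s - replicas * per_replica_per_s
    return max(0.0, start_backlog + net_growth_per_s * seconds)


# Hypothetical load: 20 jobs/s arriving, each replica clears 5 jobs/s.
# With 2 replicas the backlog grows by 10 jobs every second.
print(backlog_after(30, arrival_per_s=20, per_replica_per_s=5, replicas=2))   # 300.0
print(backlog_after(600, arrival_per_s=20, per_replica_per_s=5, replicas=2))  # 6000.0
```

A useful way to say this in a review: “At 2 replicas we clear 10 jobs a second against 20 arriving, so the backlog can only grow until we raise the cap.”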

Explaining a cost optimization: “We moved this batch worker to KEDA with scale-to-zero, since it only runs a few times a day — it was costing us a running replica 24/7 under the old HPA setup for no reason.”

Describing a trigger choice in a design doc: “We’re scaling on RabbitMQ queue depth rather than CPU, because CPU stays flat under this workload even when the queue backs up significantly.”

Professional Tips

  • Name the specific scaler in play when discussing autoscaling behavior — “it’s not scaling” is far less useful than “the Kafka scaler’s lag threshold hasn’t been crossed yet.”
  • Check the ScaledObject’s min/max replicas before assuming a trigger problem — a correctly firing trigger can still look broken if it’s capped too low.
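  • Mention the cooldown period explicitly when a deployment appears to be flapping between zero and one replica, since a cooldown that is too short is a frequent cause. Flapping between higher replica counts usually points to the HPA’s scale-down behavior instead.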
  • Use scale-to-zero deliberately in cost discussions — it’s KEDA’s headline advantage over the standard HPA and worth naming directly when justifying the choice.

Practice Exercise

  1. Explain what a scaler does and how it differs from KEDA’s ScaledObject.
  2. Describe a scenario where a short cooldown period would cause scaling to flap.
  3. Write a sentence explaining why scale-to-zero matters for cost, not just performance.