Learn the vocabulary of right-sizing a single container's own CPU and memory requests automatically.
1 / 5
At standup, a dev mentions a Kubernetes component that automatically adjusts a container's CPU and memory requests based on its observed historical usage, rather than adding more replicas. What is this component called?
A VerticalPodAutoscaler (VPA) automatically adjusts a container's CPU and memory requests based on its observed historical usage. A HorizontalPodAutoscaler (HPA) instead changes the replica count, which is a fundamentally different scaling axis from resizing an individual container's own resource requests. That focus on requests makes a VPA the right tool when a single container is persistently over- or under-provisioned rather than needing more copies of itself.
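To make the shape of the object concrete, here is a minimal sketch that builds a VPA manifest as a plain Python dict and prints it as JSON. The names `web-vpa` and `web` and the helper `build_vpa_manifest` are hypothetical, and the sketch assumes the VPA custom resource definitions are installed in the cluster under the `autoscaling.k8s.io/v1` API group.

```python
import json


def build_vpa_manifest(name, target_deployment, update_mode="Off"):
    """Return a VerticalPodAutoscaler manifest as a plain dict.

    `name` and `target_deployment` are illustrative placeholders.
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose containers the VPA should right-size.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            # "Off" only produces recommendations; nothing is applied.
            "updatePolicy": {"updateMode": update_mode},
        },
    }


manifest = build_vpa_manifest("web-vpa", "web")
print(json.dumps(manifest, indent=2))
```

The VPA points at a workload through `targetRef` and never touches the replica count, which is the field an HPA would manage.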
2 / 5
During a design review, the team wants the VPA to only recommend a new resource request value without automatically applying it, so an engineer can review the suggestion before it takes effect. Which capability supports this?
The VPA's recommendation-only update mode (`updateMode: "Off"`) surfaces a suggested new resource request value without applying it, so an engineer can review the suggestion before it takes effect. A mode that applies every recommendation automatically removes that review step, which matters for any workload where an unreviewed change carries real risk. The recommendation-only mode lets a team adopt the VPA cautiously before trusting it to apply changes on its own.
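As a rough illustration of the review step, the sketch below pulls the suggested values out of a VPA object's status, as it might appear in the JSON returned for a VPA running in recommendation-only mode (`updateMode: "Off"`). The sample object, the container name `web`, and the helper `recommended_targets` are all hypothetical; only the `status.recommendation.containerRecommendations` layout is taken from the VPA API.

```python
def recommended_targets(vpa_object):
    """Map each container name to the VPA's recommended target requests."""
    recs = (
        vpa_object.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {r["containerName"]: r["target"] for r in recs}


# Hypothetical VPA object with made-up recommendation values.
sample_vpa = {
    "spec": {"updatePolicy": {"updateMode": "Off"}},
    "status": {
        "recommendation": {
            "containerRecommendations": [
                {
                    "containerName": "web",
                    "target": {"cpu": "250m", "memory": "256Mi"},
                }
            ]
        }
    },
}

targets = recommended_targets(sample_vpa)
for container, target in targets.items():
    print(f"{container}: cpu={target['cpu']} memory={target['memory']}")
```

An engineer can compare these values against the requests in the workload's manifest and decide whether to apply them by hand.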
3 / 5
In a code review, a dev notices that applying a new VPA-recommended resource request to a running pod requires that pod to be recreated, since a container's resource requests generally can't be changed in place. What does this represent?
This is the pod-recreation requirement: applying a new VPA-recommended resource request to a running pod requires evicting and recreating that pod, because a container's resource requests generally can't be changed in place on an already-running pod. Assuming the request can be applied without recreation overlooks this limitation of how Kubernetes has traditionally handled a running container's resource requests (newer Kubernetes releases are introducing in-place pod resizing, but it cannot be assumed on every cluster). The recreation requirement is an important operational consideration, since it introduces a brief disruption whenever the VPA applies a change automatically.
4 / 5
An incident report shows a batch of pods was suddenly and repeatedly recreated throughout the day because the VPA was configured in auto-apply mode against a workload with highly variable usage, triggering a new recommendation, and a recreation, far too frequently. What practice would prevent this?
Configuring the VPA in recommendation-only mode, or tuning its update policy so that changes are applied less often, avoids recreating a pod every time a highly variable workload's usage triggers a new recommendation. Leaving it in auto-apply mode with no such tuning is exactly what caused the frequent, disruptive recreations this incident describes. This tuning is essential for any workload whose resource usage fluctuates enough to trigger the VPA constantly.
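One way to reason about which workloads are too variable for auto-apply is sketched below. This is a hypothetical heuristic, not something the VPA does itself: it computes the coefficient of variation of recent CPU samples and suggests the recommendation-only mode (`"Off"`) when usage swings widely. The function name, the sample data, and the `max_cv` threshold of 0.3 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def suggest_update_mode(cpu_samples_millicores, max_cv=0.3):
    """Suggest a VPA updateMode from usage variability (illustrative only).

    Returns "Off" when the coefficient of variation exceeds max_cv,
    otherwise "Auto".
    """
    avg = mean(cpu_samples_millicores)
    if avg == 0:
        # No measurable usage: stay in recommendation-only mode.
        return "Off"
    cv = pstdev(cpu_samples_millicores) / avg
    return "Off" if cv > max_cv else "Auto"


steady = [200, 210, 190, 205]  # made-up samples, in millicores
spiky = [50, 900, 100, 700]    # made-up samples, in millicores

print("steady:", suggest_update_mode(steady))
print("spiky:", suggest_update_mode(spiky))
```

A team could run a check like this over historical metrics before enabling auto-apply, and keep spiky workloads in recommendation-only mode.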
5 / 5
During a PR review, a teammate asks why the team uses a VerticalPodAutoscaler for a persistently over-provisioned container instead of a HorizontalPodAutoscaler adding more replicas. What is the reasoning?
A HorizontalPodAutoscaler adds more copies of an already over-sized container, which multiplies the waste rather than fixing it. A VPA instead right-sizes that single container's own CPU and memory requests based on its actual observed usage, addressing the real problem directly. The tradeoff is the pod-recreation disruption a VPA-applied change introduces, which needs to be weighed against simply leaving a slightly over-provisioned container running as-is.
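The "multiplies the waste" point can be checked with simple arithmetic. The sketch below uses made-up numbers: a container that requests 1000m of CPU but uses about 200m, and a hypothetical right-sized request of 250m. The helper `wasted_millicores` is illustrative only.

```python
def wasted_millicores(replicas, request_m, usage_m):
    """CPU requested but not used across all replicas, in millicores."""
    return replicas * max(request_m - usage_m, 0)


# Made-up figures for illustration.
one_oversized = wasted_millicores(replicas=1, request_m=1000, usage_m=200)
four_oversized = wasted_millicores(replicas=4, request_m=1000, usage_m=200)
one_right_sized = wasted_millicores(replicas=1, request_m=250, usage_m=200)

print(one_oversized, four_oversized, one_right_sized)
```

Scaling the over-sized container out to four replicas quadruples the unused reservation, while lowering the request shrinks it without adding any pods.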