5 exercises on reading kubectl get events and kubectl describe output — understand ImagePullBackOff, CrashLoopBackOff, OOMKilled, and what each event means for pod health.
CrashLoopBackOff — container keeps crashing; Kubernetes retries with exponential back-off
OOMKilled / Exit 137 — container exceeded its memory limit; killed by the Linux kernel
Normal vs. Warning events — Warning = something unexpected; investigate
Restart Count — how many times the container has been restarted; high = ongoing problem
kubectl get events output
$ kubectl get events -n production --sort-by='.lastTimestamp'
LAST SEEN TYPE REASON OBJECT MESSAGE
2m Normal Scheduled pod/api-server-7d9b4c6f8-xk2p9 Successfully assigned production/api-server-7d9b4c6f8-xk2p9 to node-3
2m Normal Pulling pod/api-server-7d9b4c6f8-xk2p9 Pulling image "registry.example.com/api-server:v2.4.0"
90s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Failed to pull image "registry.example.com/api-server:v2.4.0": rpc error: code = Unknown desc = failed to pull and unpack image: failed to resolve reference "registry.example.com/api-server:v2.4.0": unexpected status code 401 Unauthorized
90s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Error: ErrImagePull
45s Warning BackOff pod/api-server-7d9b4c6f8-xk2p9 Back-off pulling image "registry.example.com/api-server:v2.4.0"
45s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Error: ImagePullBackOff
Read the kubectl get events output. What is the root cause of the ImagePullBackOff status on the pod?
HTTP 401 Unauthorized — registry credentials are missing or expired.
Reading the events chronologically:
Scheduled: Pod placed on node-3 — OK
Pulling: Kubernetes starts pulling api-server:v2.4.0 — OK
Failed: The registry returned unexpected status code 401 Unauthorized — the cluster does not have valid credentials to access registry.example.com
ErrImagePull: Kubernetes reports the pull failed
ImagePullBackOff: Kubernetes backs off and retries with increasing delays, up to a cap of 5 minutes between attempts
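The chronological reading above can be sketched in Python. This is a minimal illustration, assuming the events are available as text lines shaped like the output shown, sorted oldest first. The helper names (`parse_event_line`, `first_warning`, `http_status`) are hypothetical and not part of kubectl.

```python
import re
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    last_seen: str
    type: str
    reason: str
    obj: str
    message: str


def parse_event_line(line: str) -> Event:
    # The first four columns contain no spaces; everything after is the message.
    last_seen, type_, reason, obj, message = line.split(None, 4)
    return Event(last_seen, type_, reason, obj, message)


def first_warning(events: List[Event]) -> Optional[Event]:
    # Assumes oldest-first order, as produced by --sort-by='.lastTimestamp'.
    for event in events:
        if event.type == "Warning":
            return event
    return None


def http_status(message: str) -> Optional[int]:
    # Pull an HTTP status out of a message such as "... status code 401 Unauthorized".
    match = re.search(r"status code (\d{3})", message)
    return int(match.group(1)) if match else None


# Event lines copied from the output shown in this exercise.
SAMPLE = """\
2m Normal Scheduled pod/api-server-7d9b4c6f8-xk2p9 Successfully assigned production/api-server-7d9b4c6f8-xk2p9 to node-3
2m Normal Pulling pod/api-server-7d9b4c6f8-xk2p9 Pulling image "registry.example.com/api-server:v2.4.0"
90s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Failed to pull image "registry.example.com/api-server:v2.4.0": rpc error: code = Unknown desc = failed to pull and unpack image: failed to resolve reference "registry.example.com/api-server:v2.4.0": unexpected status code 401 Unauthorized
90s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Error: ErrImagePull
45s Warning BackOff pod/api-server-7d9b4c6f8-xk2p9 Back-off pulling image "registry.example.com/api-server:v2.4.0"
45s Warning Failed pod/api-server-7d9b4c6f8-xk2p9 Error: ImagePullBackOff
"""

events = [parse_event_line(line) for line in SAMPLE.splitlines()]
root = first_warning(events)
print(root.reason, http_status(root.message))
```

The earliest Warning usually carries the most specific message; the later ErrImagePull and ImagePullBackOff entries repeat the same failure in shorter form.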
ImagePullBackOff vs. ErrImagePull:
ErrImagePull — the immediate failure on the first attempt
ImagePullBackOff — Kubernetes is in retry back-off mode after repeated failures. Not a new error — same root cause.
Common fix for a 401: create or renew an image pull secret in the pod's namespace and reference it under imagePullSecrets in the pod spec:
kubectl create secret docker-registry regcred -n production --docker-server=registry.example.com --docker-username=... --docker-password=...
The pod's Last State shows Reason: OOMKilled and Exit Code: 137. What does this tell a developer?
OOMKilled (Out Of Memory Killed) — the kernel killed the container for exceeding its memory limit.
Kubernetes enforces memory limits at the Linux cgroups level. When a container exceeds its limit (256Mi in this case), the Linux kernel's OOM (Out Of Memory) Killer sends a SIGKILL signal to the process.
Exit code 137 = 128 + 9:
Unix exit codes above 128 mean "killed by signal N" where N = exit code - 128
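137 = 128 + 9 → killed by SIGKILL (signal 9). SIGKILL alone does not prove an out-of-memory kill; the Reason: OOMKilled field is what confirms it.
Other common exit codes: 143 = 128 + 15 = SIGTERM (termination was requested, e.g. when a pod is deleted); 1 = general error; 2 = misuse of a shell builtin (a Bash convention)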
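The 128 + N arithmetic can be checked with a few lines of Python. This is a sketch that assumes a POSIX system, where the standard `signal` module knows SIGKILL and SIGTERM; the function name `describe_exit_code` is made up for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a container exit code using the 128 + N convention."""
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {number}"
        return f"killed by {name}"
    if code == 0:
        return "exited successfully"
    return f"exited with error status {code}"


for code in (137, 143, 1):
    print(code, describe_exit_code(code))
```

Note that this only names the signal. Whether a SIGKILL came from the OOM killer still has to be read from the Reason field in kubectl describe.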
Notice the memory settings:
requests.memory: 128Mi is the amount the scheduler reserves for the container when choosing a node
limits.memory: 256Mi is the hard cap; exceeding it gets the container OOMKilled
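To make the gap between request and limit concrete, the quantities can be converted to bytes. The sketch below handles only whole numbers with the binary suffixes Ki, Mi, and Gi, which is a simplifying assumption; Kubernetes accepts more quantity formats than this. The helper name `to_bytes` is hypothetical.

```python
# Binary suffixes used by Kubernetes memory quantities (powers of 1024).
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '256Mi' to bytes (integers only)."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No suffix: the value is already in bytes.
    return int(quantity)


request = to_bytes("128Mi")
limit = to_bytes("256Mi")
headroom = limit - request
print(request, limit, headroom)
```

The container may use anything between the request and the limit if the node has memory to spare, but it is never allowed past the limit.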
Fixes:
Increase the memory limit if the app legitimately needs more
Profile and fix memory leaks in the application code
Add pagination/chunking if the app loads large data sets into RAM
The pod's Restart Count is 8 and its current state is CrashLoopBackOff. What does CrashLoopBackOff mean in Kubernetes?
CrashLoopBackOff = the container keeps crashing; Kubernetes keeps restarting it with increasing delays.
Kubernetes's default restart policy is Always. When a container exits (for any reason), Kubernetes restarts it. If it keeps crashing, Kubernetes applies exponential back-off:
Restart 1: immediately
Restart 2: 10 seconds wait
Restart 3: 20 seconds wait
Restart 4: 40 seconds wait
... up to a maximum of 5 minutes between attempts; the back-off timer resets once a container has run for 10 minutes without problems
CrashLoopBackOff does NOT mean Kubernetes has given up — it will keep retrying. The back-off is to avoid hammering the node with rapid restarts.
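The back-off schedule described above can be generated with a short Python function. This is an idealized sketch of the doubling-with-a-cap pattern as listed in this exercise, not the kubelet's actual implementation; real delays vary, and `backoff_delays` and its parameters are illustrative names.

```python
from typing import List


def backoff_delays(restarts: int, base: int = 10, cap: int = 300) -> List[int]:
    """Return the wait in seconds before each restart, doubling up to a cap."""
    delays = []
    for n in range(restarts):
        if n == 0:
            # First restart happens right away.
            delays.append(0)
        else:
            delays.append(min(base * 2 ** (n - 1), cap))
    return delays


schedule = backoff_delays(8)
print(schedule)
print(sum(schedule), "seconds spent waiting across 8 restarts")
```

With a Restart Count of 8, the later restarts are already at the 5-minute cap, which is why a crash-looping pod appears to sit idle for long stretches between attempts.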
In this case: the container is OOMKilled, Kubernetes restarts it, it runs again, and it is OOMKilled again. With 8 restarts and OOMKilled as the reason, the cause is a memory problem (a leak, a startup spike, or an undersized limit), not a transient startup error.
How to diagnose:
kubectl logs <pod> --previous — logs from the last crashed container
kubectl describe pod <pod> — events and state (as shown)
Metrics: check memory usage trend with Prometheus/Grafana
The events show both TYPE: Warning and TYPE: Normal entries. What does an event TYPE: Warning indicate in Kubernetes?
Warning events signal something unexpected — investigate, but the cluster itself may still be running other workloads.
Kubernetes event types:
Normal — expected operations (scheduled, pulled, started, killed by user). These document normal lifecycle transitions.
Warning — something did not go as expected. The kubelet or controller tried something and failed, or detected a problem. Action may be required.
In this output:
Normal: Scheduled → pod placed on node-3 successfully
Normal: Pulling → image pull initiated
Warning: Failed → pull failed (401 Unauthorized)
Warning: BackOff → Kubernetes is in back-off mode retrying
Warning: Failed → ImagePullBackOff — still failing
Important nuance: A Warning event on one pod does not mean the entire namespace or cluster is broken. Other pods may be running fine. Warnings are scoped to the object they are attached to.
Useful commands:
kubectl get events --field-selector type=Warning — filter to warnings only
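The same filtering can be done on structured output. The sketch below assumes event data shaped like the result of `kubectl get events -o json` (an `items` list whose entries have `type`, `reason`, and `involvedObject` fields) and groups Warning reasons by the object they are attached to. The sample is trimmed to the fields used, and `warnings_by_object` is a hypothetical helper name.

```python
import json
from collections import defaultdict
from typing import Dict, List

POD = "api-server-7d9b4c6f8-xk2p9"

# Hand-built sample mirroring the events in this exercise (fields trimmed).
sample = json.dumps({
    "items": [
        {"type": "Normal", "reason": "Scheduled", "involvedObject": {"kind": "Pod", "name": POD}},
        {"type": "Normal", "reason": "Pulling", "involvedObject": {"kind": "Pod", "name": POD}},
        {"type": "Warning", "reason": "Failed", "involvedObject": {"kind": "Pod", "name": POD}},
        {"type": "Warning", "reason": "Failed", "involvedObject": {"kind": "Pod", "name": POD}},
        {"type": "Warning", "reason": "BackOff", "involvedObject": {"kind": "Pod", "name": POD}},
        {"type": "Warning", "reason": "Failed", "involvedObject": {"kind": "Pod", "name": POD}},
    ]
})


def warnings_by_object(events_json: str) -> Dict[str, List[str]]:
    """Group the reasons of Warning events by 'kind/name' of the affected object."""
    grouped = defaultdict(list)
    for item in json.loads(events_json)["items"]:
        if item["type"] != "Warning":
            continue
        obj = item["involvedObject"]
        grouped[f'{obj["kind"].lower()}/{obj["name"]}'].append(item["reason"])
    return dict(grouped)


result = warnings_by_object(sample)
print(result)
```

Grouping by object makes the scoping visible: every Warning here belongs to a single pod, so nothing in this output points to a namespace-wide or cluster-wide problem.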
The container ran from 10:22:14 to 10:22:41 before being OOMKilled. What does this short runtime duration suggest about the nature of the memory problem?
27-second runtime before OOMKill suggests a startup-time memory spike or a very low limit relative to the app's baseline usage.
Interpreting the timeline:
Started: 10:22:14
Finished: 10:22:41
Duration: ~27 seconds
Restart count: 8 — this pattern repeats every restart
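The duration can be computed directly from the two timestamps. This small Python sketch assumes both times fall on the same day and use the HH:MM:SS format shown; `runtime_seconds` is an illustrative name.

```python
from datetime import datetime


def runtime_seconds(started: str, finished: str) -> float:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds()


duration = runtime_seconds("10:22:14", "10:22:41")
print(duration)
```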
What this tells us:
The container does NOT run for a long time and gradually leak memory — that would take minutes or hours
Instead, one of two things is happening:
① The process allocates a large data structure at startup (e.g., loads a full dataset into RAM)
② The limit (256Mi) is very low relative to the app's actual steady-state memory footprint
Contrast with a slow memory leak:
Slow leak: container runs for hours/days, restart count grows slowly over time
Startup spike: container crashes within seconds every time — consistent short lifetime
Diagnostic steps:
kubectl logs worker-6b8c9d7f5-mnp12 --previous — see what the app logged before OOMKill