Certified Kubernetes Application Developer (CKAD) #12 Observability: logging, kubectl debug, port-forward, ephemeral container

In #11 Probes, you learned how to report an app’s health to the cluster with liveness and readiness probes. But when a probe fails or a Pod keeps restarting, you have to look inside the Pod to find the cause. Observability is only a 15% domain on the CKAD exam, but it is also the chokepoint you pass through whenever you get stuck on a problem from any other domain.

This post organizes the tools you pull out when an app misbehaves, command by command. We go in this order: read the logs, check state and events, get inside the container, forward a port to your local machine, and debug even a shell-less container.

Logs: the first place you look

When an app acts up, the first thing you look at is the logs. kubectl logs shows you exactly what the container sent to standard output (stdout) and standard error (stderr).

# Basic logs
k logs mypod

# Follow in real time (-f = follow)
k logs -f mypod

# Only the last 50 lines
k logs mypod --tail=50

# Only the last 10 minutes' worth
k logs mypod --since=10m

-f follows the logs in real time, which is useful when you want to watch what an app prints while it handles requests. --tail and --since cut a long log down to just the part you need.

The previous logs of a restarted container

If a Pod has fallen into CrashLoopBackOff, you need to look not at the logs of the container that’s up now, but at the logs of the container right before it died to find the cause. That’s when you use --previous.

# Logs of the container that terminated just before
k logs mypod --previous

# Shorthand is -p
k logs mypod -p

When a Pod crashes repeatedly, the current container has usually just started and its log is often empty. That is why --previous comes up so often when debugging crashes in the practical exam.
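As a rough sketch of that habit, the helper below asks for the previous container’s logs first and falls back to the current ones when no earlier instance exists. The name crash_logs and the 50-line tail are assumptions made for illustration, not kubectl features, and the function calls kubectl directly because shell aliases such as k are not expanded in non-interactive scripts.

```shell
# crash_logs is a hypothetical helper name, not a kubectl feature.
# Usage: crash_logs <pod> [container]
crash_logs() {
  pod=$1
  container=${2:-}
  # Build the shared arguments; add -c only when a container was named.
  set -- "$pod" --tail=50
  if [ -n "$container" ]; then
    set -- "$@" -c "$container"
  fi
  # --previous fails when no earlier container instance exists,
  # so fall back to the logs of the current one.
  kubectl logs "$@" --previous 2>/dev/null || kubectl logs "$@"
}
```

Calling crash_logs mypod, or crash_logs mypod sidecar for one container of a multi-container Pod, then prints whichever of the two logs exists.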

Logs of a multi-container Pod

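If a single Pod has multiple containers, specify which container’s logs you want with -c. If you leave it out, recent kubectl versions fall back to a default container (the first one, unless the kubectl.kubernetes.io/default-container annotation names another) and print a notice, while older versions return an error asking you to choose one.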

# Just a specific container
k logs mypod -c sidecar

# All containers at once
k logs mypod --all-containers=true

# Logs of multiple Pods at once by label
k logs -l app=web --all-containers=true --tail=20

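--all-containers shows the combined logs of every container in the Pod, and if you pass a label with -l, you get the logs of every Pod carrying that label at once. Note that with a selector, --tail defaults to the last 10 lines instead of the whole log, so set it explicitly when you need more.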
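When the combined output of --all-containers is hard to attribute, one option is to loop over the container names yourself. The sketch below is only an illustration: tail_each is a hypothetical helper name, and it reads the names from the Pod spec with jsonpath before calling kubectl logs -c for each one. Init containers are not covered.

```shell
# tail_each is a hypothetical helper: print the last lines of every
# regular container in a Pod under a header naming the container.
# Usage: tail_each <pod> [lines]
tail_each() {
  pod=$1
  lines=${2:-20}
  # jsonpath prints the container names separated by spaces.
  for c in $(kubectl get pod "$pod" -o jsonpath='{.spec.containers[*].name}'); do
    echo "== $c =="
    kubectl logs "$pod" -c "$c" --tail="$lines"
  done
}
```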

State and events: causes that don’t show up in logs

Logs are the story after the container has started running. Problems that happen before the container comes up or at the cluster level — image pull failures, scheduling failures, probe failures — don’t show up in logs. For those, you look at describe and events.

# A Pod's detailed state and recent events
k describe pod mypod

The Events section at the very bottom of k describe pod’s output is the key part. Messages like Failed to pull image, Liveness probe failed, and Insufficient memory appear there verbatim. Read the container’s State, Last State, Exit Code, and Reason alongside them.
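The Exit Code under Last State is easier to read once you know a few common values. The sketch below is a hypothetical helper, not part of kubectl: explain_exit_code maps some well-known codes to a first guess, and last_exit reads the exit code of the first container’s last termination with jsonpath. The interpretations follow the usual 128 + signal number convention and are hints, not a diagnosis.

```shell
# explain_exit_code is a hypothetical helper that turns a container
# exit code into a first guess about what happened.
explain_exit_code() {
  case "$1" in
    0)   echo "exited normally" ;;
    137) echo "SIGKILL (128+9): often OOMKilled, check Reason" ;;
    143) echo "SIGTERM (128+15): asked to stop" ;;
    *)   echo "exit code $1: read the app logs" ;;
  esac
}

# last_exit (also a hypothetical name) reads the exit code of the
# first container's previous termination and interprets it.
# It prints nothing if the container has never terminated.
last_exit() {
  code=$(kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
  if [ -n "$code" ]; then
    explain_exit_code "$code"
  fi
}
```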

# All events in the namespace, in chronological order
k get events --sort-by=.lastTimestamp
# Only the events for a specific Pod
k get events --field-selector involvedObject.name=mypod

Adding --sort-by=.lastTimestamp sorts events by time so you can see at a glance what just happened. Without it, the order is not guaranteed to be chronological, which makes the list hard to read.

Checking an object’s actual definition

When you’re not sure a manifest was applied the way you intended, pull the actual object stored in the cluster as YAML.

# The full applied definition as YAML
k get pod mypod -o yaml

# Just a specific field via jsonpath (image, status phase, etc.)
k get pod mypod -o jsonpath='{.spec.containers[*].image}'
k get pod mypod -o jsonpath='{.status.phase}'

-o yaml shows everything, including defaults and fields the controllers filled in, so it’s useful for checking the difference between the manifest you wrote and the actual applied result.

Getting inside the container

When logs and state still don’t pin down the cause, you go inside the container and check directly. kubectl exec runs a command in a running container.

# Open an interactive shell (-it = interactive + tty)
k exec -it mypod -- sh

# Run a single command only
k exec mypod -- env
k exec mypod -- cat /etc/config/app.conf

# For multi-container, specify the container with -c
k exec -it mypod -c sidecar -- sh

Put the command to run after --. Instead of opening a shell, it’s often faster to run a single command — checking environment variables with env, or reading a mounted config file with cat. If the container doesn’t have bash, open a shell with sh. Inside, you typically check whether environment variables were injected correctly from ConfigMaps and Secrets (env), whether mounted volumes are at the expected path (ls, cat), and whether you can reach another Service (wget -qO- http://svc:80, nslookup svc).
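The environment check in particular lends itself to a single non-interactive command. As a minimal sketch, the helper below dumps the container’s environment once with kubectl exec and reports which expected variables are present. The name require_env and the variable names in the usage line are hypothetical, and the container image is assumed to ship an env binary.

```shell
# require_env is a hypothetical helper: report which of the named
# variables exist inside the container's environment.
# Usage: require_env <pod> VAR [VAR...]
require_env() {
  pod=$1
  shift
  # One exec call; give up if the Pod or the env binary is unavailable.
  env_dump=$(kubectl exec "$pod" -- env) || return 1
  missing=0
  for name in "$@"; do
    if printf '%s\n' "$env_dump" | grep -q "^${name}="; then
      echo "ok $name"
    else
      echo "missing $name"
      missing=1
    fi
  done
  # Non-zero when at least one variable was not found.
  return "$missing"
}
```

For example, require_env mypod DB_HOST DB_PASSWORD prints one line per variable and returns a non-zero status if either one was not injected.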

Port forwarding: test directly from local

When you want to reach a Pod or Service inside the cluster without exposing it externally, you use kubectl port-forward. It passes requests that arrive on a local port straight through to the target.

# Pod's port 80 to local 8080
k port-forward pod/mypod 8080:80

# A Service target works too
k port-forward svc/web 8080:80

# Targeting a Deployment connects to one of its backing Pods
k port-forward deploy/web 8080:80

With this running, you can check the response from another terminal with curl http://localhost:8080. It’s useful for distinguishing whether a Service’s selector or targetPort is misconfigured so traffic isn’t reaching the Pod, or whether the app itself can’t respond. port-forward stays alive only while the command is running.
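To run the forward and the request from a single terminal, you can put the forward in the background. The following is a minimal sketch under a few assumptions: probe_via_forward is a made-up helper name, curl is installed locally, and a fixed two-second sleep is long enough for the tunnel to come up (a retry loop would be more robust).

```shell
# probe_via_forward is a hypothetical helper: forward a port in the
# background, request / once, print the HTTP status, then clean up.
# Usage: probe_via_forward <pod/NAME|svc/NAME|deploy/NAME> <local> <remote>
probe_via_forward() {
  target=$1
  local_port=$2
  remote_port=$3
  kubectl port-forward "$target" "${local_port}:${remote_port}" >/dev/null 2>&1 &
  pf_pid=$!
  # Crude wait for the tunnel to be ready (an assumption, see above).
  sleep 2
  # Print only the status code; curl reports 000 if nothing answered.
  curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' \
    "http://localhost:${local_port}/"
  rc=$?
  # The forward only lives as long as its process, so stop it here.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$rc"
}
```

If probe_via_forward pod/mypod 8080 80 prints 200 while the same call against svc/web fails or prints 000, that is a hint to look at the Service’s selector or targetPort rather than at the app.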

ephemeral container: debugging a shell-less container

Modern production images, like distroless, often have no shell and no debugging tools, built that way for security and size. k exec -- sh doesn’t work on such a container — there’s no shell at all. That’s when you attach an ephemeral container with kubectl debug.

An ephemeral container is a container added temporarily to a running Pod. It leaves the original container alone and slips another image carrying tools into the same Pod so you can look inside together.

# Attach busybox as a temporary container and open a shell
k debug -it mypod --image=busybox

# Share the target container's process namespace with --target
k debug -it mypod --image=busybox --target=app

--target=app makes the ephemeral container share the process namespace of the original container named app. From inside the temporary container you can then see the original container’s processes with ps and, permissions allowing, reach its files through /proc/<PID>/root, so even a shell-less container can be debugged. To experiment without touching the original, create a copy of the Pod with --copy-to=mypod-debug and debug that instead.
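The copy created by --copy-to is a regular Pod that stays around after the debug session ends, so it is worth pairing the two steps. The helper below is a sketch only: debug_copy is a hypothetical name, busybox is just the example image used above, and the "-debug" suffix is an arbitrary naming choice.

```shell
# debug_copy is a hypothetical helper: debug a copy of a Pod, then
# delete the copy once the interactive session ends.
# Usage: debug_copy <pod>
debug_copy() {
  pod=$1
  copy="${pod}-debug"
  # --share-processes lets the debug container see the app's processes
  # in the copy, much like --target does on the original Pod.
  kubectl debug "$pod" -it --image=busybox \
    --copy-to="$copy" --share-processes
  # The copy outlives the session, so remove it explicitly.
  kubectl delete pod "$copy" --wait=false
}
```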

Node debugging

When you need to look at the node itself rather than a Pod, you spin up a debug Pod on the node.

# A debug Pod with the node's filesystem mounted at /host
k debug node/node01 -it --image=busybox

This command mounts the node’s filesystem under the container’s /host, so you can directly check the node’s logs and config files.

Viewing resource usage

When a Pod dies with OOMKilled or slows down, you need to look at CPU and memory usage. kubectl top shows current usage as collected by metrics-server.

# Per-Pod usage (--containers for per-container granularity)
k top pod
k top pod mypod --containers

# Per-node usage
k top node

k top only works if metrics-server is installed in the cluster. Without it, you get a Metrics API not available error. The practical exam environment usually has metrics-server ready, but on a local cluster you spun up yourself, you have to install it separately. k top pod is the starting point when you’re judging whether usage is hitting limits and getting throttled, or dying from OOM.
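Usage numbers only mean something next to the limits they are measured against. As a small sketch, the helper below prints both for one Pod. The name usage_vs_limit is hypothetical, the first command still depends on metrics-server, and containers without limits simply show empty columns.

```shell
# usage_vs_limit is a hypothetical helper: show measured usage next to
# the configured CPU and memory limits of each container in a Pod.
# Usage: usage_vs_limit <pod>
usage_vs_limit() {
  pod=$1
  echo "usage (needs metrics-server):"
  kubectl top pod "$pod" --containers
  echo "limits (name, cpu, memory):"
  # One tab-separated line per container, taken from the Pod spec.
  kubectl get pod "$pod" -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits.cpu}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}
```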

The troubleshooting flow

Rather than memorizing the tools separately, learning them in an order that starts from the symptom is faster on the exam.

  1. Check the symptom: look at state with k get pod. Figure out what it is — Pending, ImagePullBackOff, CrashLoopBackOff, or Running but READY 0/1, and so on.
  2. describe: read the Events and the container’s State and Exit Code with k describe pod mypod. Scheduling, image, and probe problems usually surface here.
  3. logs: if the container came up, look at the app logs with k logs mypod. If it’s restarting, look at the previous logs with --previous.
  4. exec / debug: if logs don’t catch it, check environment variables, files, and connectivity from inside with k exec -it mypod -- sh. If there’s no shell, attach an ephemeral container with k debug.
  5. port-forward / top: narrow down network paths with k port-forward and resource limits with k top.

You only move to the next step when the previous one didn’t reveal the cause. Most problems end at step 2 (describe) and step 3 (logs).
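The first branch of that flow can be written down as a lookup from the STATUS column to the next command. This is a sketch of the idea rather than a tool to rely on: next_step is a hypothetical helper, it only prints a suggestion, and the mapping is a simplified reading of the steps above.

```shell
# next_step is a hypothetical helper: map the STATUS shown by
# `kubectl get pod` to the command worth typing next.
# Usage: next_step <status> [pod]
next_step() {
  status=$1
  pod=${2:-mypod}
  case "$status" in
    Pending|ImagePullBackOff|ErrImagePull)
      # Scheduling and image problems surface in Events.
      echo "kubectl describe pod $pod" ;;
    CrashLoopBackOff|Error)
      # The container did start, so the previous log has the cause.
      echo "kubectl logs $pod --previous" ;;
    OOMKilled)
      echo "kubectl top pod $pod --containers" ;;
    *)
      # Includes Running with READY 0/1: check probe events first.
      echo "kubectl describe pod $pod" ;;
  esac
}
```

One way to feed it is next_step "$(kubectl get pod mypod --no-headers | awk '{print $3}')" mypod, which takes the third column of the default output as the status.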

Exam points

  • --previous is the heart of crash debugging. In a CrashLoopBackOff Pod the current container has usually just started and its log is often empty, so you look at the previous log.
  • -c is effectively mandatory for multi-container Pods. Without it you either get an error or end up with the default (usually the first) container, depending on the kubectl version.
  • k get events --sort-by=.lastTimestamp sorts events chronologically so you can quickly find what just happened.
  • Remember the shape of k debug --image=... --target=.... A shell-less container like distroless can’t be entered with exec; you debug it with an ephemeral container.
  • Don’t confuse the target notation of k port-forward (pod/, svc/, deploy/) and the local:target port order.
  • Remember that k top depends on metrics-server. If it doesn’t work, the problem may be the environment, not the tool.
  • describe and logs settle most cases. When stuck, these are the first two commands you type.

If you want to see observability in a broader operational context, the troubleshooting post of the K8s practical track covers cluster-level diagnostics as well.

Wrap-up

What this post locked in:

  • Logs. Pick the right logs for the situation with k logs’s -f, --previous, -c, --tail, --since, and --all-containers.
  • State and events. Find causes that don’t show up in logs with k describe pod’s Events, k get events --sort-by=.lastTimestamp, and k get pod -o yaml.
  • Getting inside. Check environment variables, files, and connectivity with k exec -it -- sh and single commands.
  • Port forwarding. Test the response directly from local without exposure using k port-forward.
  • ephemeral container. Debug shell-less containers with k debug --image=... --target=..., and nodes with k debug node/node01 --image=....
  • Resource usage. Look at CPU and memory with k top pod and k top node (requires metrics-server).
  • The flow. Narrow down in the order: check symptom → describe → logs → exec/debug → port-forward/top.

Next: ConfigMap and Secret in depth

You now have the tools to look inside an app. Next we go into the largest domain — injecting configuration and secret values into that app.

In #13 ConfigMap and Secret in depth: volume vs env, auto-refresh, we’ll work through it hands-on: the difference between injecting ConfigMaps and Secrets as environment variables versus mounting them as volumes, the mechanism by which a volume mount automatically reflects value changes, the encoding and types of Secrets, and the injection formats that come up often on the exam.
