Certified Kubernetes Security Specialist (CKS) #18: Container immutability, forensics

In #17 (Falco behavioral analysis, audit logs) we covered how to detect anomalous behavior at runtime and leave an audit trail. Detection only tells you that “something is going wrong.” This post stays in the same Runtime Security domain but looks at what comes before and after detection: immutability, which hardens a container so it can’t be changed in the first place, and forensics, the incident response after a breach has already happened.

Immutability and incident response are a pair. If you harden a container to be immutable, it becomes hard for an attacker to plant a binary or change configuration inside it, and even if a breach does occur, the question “what was originally there?” has a crisp answer, which makes investigation easier. On the exam these two come bundled together too: the task of applying immutability settings, and the procedure for isolating a compromised Pod and preserving evidence.

What is an immutable container #

An immutable container is a container whose contents do not change once it has started running. To fix code or change configuration, you don’t go inside the container and edit files — you build a new image and redeploy. You never touch a live container.

This mindset ties directly to security. When a container is compromised, the first thing an attacker wants to do is plant something inside it. They download and run a coin-mining binary, write a backdoor script to /tmp or a system path, or overwrite an existing binary with a malicious one. If the root filesystem is read-only, every one of these write attempts fails unless it targets a path you have deliberately left writable.

The core setting that enforces immutability in Kubernetes is the container’s securityContext.readOnlyRootFilesystem: true. In #8 kernel hardening we covered the securityContext fields that strip privileges; of those, this field is the starting point of immutability.

What readOnlyRootFilesystem blocks #

Turning this field on mounts the container’s root filesystem read-only. The following attack patterns are blocked outright.

  • Downloading a malicious binary, e.g. curl ... -o /usr/local/bin/miner
  • Planting a backdoor in /etc/cron.d or a startup script
  • Overwriting an existing system binary with a malicious one
  • Leaving traces through shell history or temporary files

A read-only filesystem is also highly valuable for post-breach investigation. Because the root filesystem is guaranteed to be identical to the boot-time image, it becomes a clean baseline you don’t have to second-guess for tampering.

Applying readOnlyRootFilesystem #

The core is a single line. You add readOnlyRootFilesystem: true to the container’s securityContext.

apiVersion: v1
kind: Pod
metadata:
  name: immutable-web
spec:
  containers:
    - name: web
      image: nginx:1.27
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]

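This manifest hardens the root filesystem to read-only, requires a non-root user, blocks privilege escalation, and drops all capabilities. It applies immutability and least privilege as one bundle. Note that runAsNonRoot: true only verifies the user: the kubelet refuses to start a container whose image would run as root, and the stock nginx image does, so in practice you pair this field with an image built to run as non-root or with an explicit runAsUser.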
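Because readOnlyRootFilesystem is set per container, it is easy to miss one in a multi-container Pod. The sketch below lists the containers of a Pod that do not report the field as true. It is a minimal helper under assumptions: the function name containers_missing_ro_rootfs is hypothetical, it looks only at regular containers (not init containers), and it needs kubectl access to the cluster.

```shell
# Hypothetical helper: print the name of every container in a Pod whose
# securityContext.readOnlyRootFilesystem is not set to true.
# Usage: containers_missing_ro_rootfs <pod> [namespace]
containers_missing_ro_rootfs() {
  pod=$1
  ns=${2:-default}
  # jsonpath emits one "name<TAB>value" line per container; the value is
  # empty when the field is not set at all.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.securityContext.readOnlyRootFilesystem}{"\n"}{end}' \
    | awk -F'\t' '$2 != "true" { print $1 }'
}
```

Calling containers_missing_ro_rootfs immutable-web should print nothing when every container in the Pod is read-only.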

Paths that need writes go to emptyDir #

The problem is that many applications need writes to operate. nginx has to write to /var/cache/nginx and /var/run, and some applications create temporary files in /tmp. If you make the entire root filesystem read-only, these legitimate writes are blocked too and the container won’t come up.

The solution is to pick only the directories that need writes and overlay a writable volume on them. Typically you mount an emptyDir volume at that path. An emptyDir exists only for the lifetime of the Pod and is emptied along with the Pod when it disappears, so even if an attacker plants something there, a single redeploy makes it vanish without a trace.

apiVersion: v1
kind: Pod
metadata:
  name: immutable-nginx
spec:
  containers:
    - name: web
      image: nginx:1.27
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: cache
          mountPath: /var/cache/nginx
        - name: run
          mountPath: /var/run
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: cache
      emptyDir: {}
    - name: run
      emptyDir: {}
    - name: tmp
      emptyDir: {}

The entire root filesystem is read-only, but the three directories — cache, run, and tmp — are opened as writable emptyDirs. Keeping read-only as the default and opening only the places that need it as exceptions is the standard form of immutable operation.
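One way to confirm that the exceptions behave as intended is to probe a writable path and a read-only path from outside. The helper below is a sketch, not part of the original walkthrough: the name check_write is hypothetical, it assumes the Pod is running and that the image ships sh, touch, and rm, and it briefly creates a marker file, so it belongs on a test Pod rather than on one you are preserving as evidence.

```shell
# Hypothetical helper: report whether a path inside a running Pod accepts writes.
# It creates and immediately removes a marker file in that path.
# Usage: check_write <pod> <path>
check_write() {
  pod=$1
  path=$2
  if kubectl exec "$pod" -- sh -c 'touch "$1" && rm -f "$1"' sh "$path/.write-test" >/dev/null 2>&1; then
    echo "$path: writable"
  else
    echo "$path: not writable"
  fi
}
```

With the manifest above, check_write immutable-nginx /tmp would be expected to report a writable path and check_write immutable-nginx /etc a non-writable one.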

How to find which paths to open #

Often you don’t know in advance which directories need writes. The simplest approach is to first apply only readOnlyRootFilesystem: true, bring the container up, and then check the logs to see where a write was attempted and failed.

kubectl logs immutable-nginx
# e.g. check the path pointed to by "Read-only file system" or "Permission denied"

Open only the paths the failures point to, one emptyDir at a time, narrowing down until the container comes up cleanly. On the exam, well-known images like nginx come up often, so keeping /var/cache/nginx, /var/run, and /tmp in mind gets you through quickly.
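When the log is long, filtering it down to the failing paths saves time. The following is a minimal sketch under an assumption about the log format: it expects the application to print the failing path in double quotes on the same line as the error, which is how nginx reports it. Other applications may format the message differently, and the function name failed_write_paths is hypothetical.

```shell
# Hypothetical helper: read log lines on stdin and print the unique
# double-quoted absolute paths from lines that report a failed write.
failed_write_paths() {
  grep -E 'Read-only file system|Permission denied' \
    | grep -oE '"/[^"]+"' \
    | tr -d '"' \
    | sort -u
}
```

A typical call would be kubectl logs immutable-nginx | failed_write_paths, and the parent directory of each printed path is a candidate for an emptyDir mount.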

Forbidding in-place changes and preventing drift #

Immutability is not just a setting — it’s an operating principle too. You do not kubectl exec into a live container to edit files. This kind of in-place change creates drift, the problem where the state declared by the manifest and the actual container state diverge. As drift accumulates, no one can say with certainty “what exactly is running right now,” and that itself is a security gap.

The principle is simple.

  • Make code and configuration changes only by building a new image → redeploying
  • Do not enter a live container to edit files
  • Use kubectl exec only for investigation and debugging, never for changes

readOnlyRootFilesystem is the mechanism that enforces this principle technically. Even if someone goes in by mistake and tries to edit a file, the filesystem refuses it, structurally removing any room for drift to occur.

Stabilizing startup with startupProbe #

An immutable container only works correctly if all its write paths are properly opened at startup. If the application is slow to start or takes time to initialize, it’s safer to let startupProbe first decide whether startup has finished, and only then let livenessProbe and readinessProbe take over. Liveness and readiness checks are held back until the startupProbe succeeds, which buys time for initialization to complete in a read-only environment.

      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 5

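This setting waits up to 150 seconds (30 × 5 seconds) for startup, during which no restart from a liveness failure occurs. If the startupProbe still hasn’t succeeded when that budget runs out, the container is killed and restarted according to the Pod’s restartPolicy.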
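The same arithmetic can be run in reverse to size failureThreshold for a startup time you have measured. This is a small sketch with hypothetical function names, and it ignores initialDelaySeconds and the probe’s own timeout.

```shell
# Startup budget in seconds = failureThreshold x periodSeconds.
startup_budget_seconds() {
  echo $(( $1 * $2 ))
}

# Smallest failureThreshold that covers a target startup time (rounds up).
failure_threshold_for() {
  target_seconds=$1
  period_seconds=$2
  echo $(( (target_seconds + period_seconds - 1) / period_seconds ))
}

startup_budget_seconds 30 5    # prints 150
failure_threshold_for 150 5    # prints 30
failure_threshold_for 152 5    # prints 31
```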

Forensics: handling a compromised Pod #

If a breach happens despite detection and prevention, you’re now in the territory of incident response. The core of forensics as CKS expects it is not fancy analysis tooling but procedure. Get the order wrong and evidence disappears or the attack spreads. There are two principles: contain it so it doesn’t spread, then preserve evidence before investigating.

1) Isolate: contain it so it doesn’t spread #

The first thing to do is cut off the compromised Pod so it can’t communicate with other Pods or the outside. But you must not delete the Pod right away. The moment you delete it, volatile evidence such as memory, processes, and temporary files disappears with it.

Start by cutting the network. Attach a dedicated label to the compromised Pod and apply a NetworkPolicy that selects that label and blocks all ingress and egress. This is the default-deny pattern from #2 NetworkPolicy, aimed at a single Pod.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-compromised
  namespace: prod
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
    - Ingress
    - Egress

The podSelector picks the Pod carrying the quarantine label, and policyTypes includes Ingress and Egress but leaves no allow rules at all, blocking all communication. Attach the label to the compromised Pod with kubectl label pod <name> quarantine=true and it’s isolated immediately. This cuts off the attacker’s command-and-control (C2) communication and lateral movement.

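Changing labels has another effect. If the Pod is managed by a Deployment, removing or overwriting the label that the Service and ReplicaSet selectors match (for example, app) takes the Pod out of both: traffic no longer goes to it, and the ReplicaSet controller spins up a fresh, healthy replacement. Because NetworkPolicies are additive, this also takes the Pod out of any allow policy that selects on that label, which would otherwise keep applying alongside the quarantine policy. This is how you set the compromised Pod aside for investigation while keeping the service running.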
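The two label changes can be sketched as one helper. The function name quarantine_pod is hypothetical, and the sketch assumes the Service and ReplicaSet select on a single label key that you pass in (such as app); check the real selectors with kubectl get svc and kubectl get rs before relying on it.

```shell
# Hypothetical helper: quarantine a Pod by labels only, without deleting it.
# Usage: quarantine_pod <namespace> <pod> <selector-label-key>
quarantine_pod() {
  ns=$1
  pod=$2
  selector_key=$3
  # 1) Add the label that the quarantine NetworkPolicy selects on.
  kubectl label pod "$pod" -n "$ns" quarantine=true --overwrite || return 1
  # 2) Remove the label the Service and ReplicaSet match; a trailing "-"
  #    tells kubectl label to delete that key.
  kubectl label pod "$pod" -n "$ns" "${selector_key}-"
}
```

The quarantine label goes on first so that the Pod is already covered by the deny policy at the moment it leaves the Service.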

Next, isolate the node. If there’s a chance the breach has spread to the node level, use kubectl cordon to stop new Pods from being scheduled onto that node.

kubectl cordon node-3      # block new Pod scheduling (existing Pods stay)

cordon doesn’t evict existing Pods — it only blocks new scheduling, so it preserves the evidence on the node while limiting the spread. drain can scatter evidence by evicting Pods, so before investigation it’s safer to apply only cordon.
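To confirm the cordon took effect, you can read the node’s spec.unschedulable field, which is true on a cordoned node and absent otherwise. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: succeed if the node is marked unschedulable (cordoned).
# Usage: is_cordoned <node>
is_cordoned() {
  [ "$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}
```

For example, is_cordoned node-3 && echo "node-3 is cordoned".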

2) Preserve evidence: capture it before you delete #

Once isolation is done, secure the evidence before touching the Pod. Grab the most volatile items first.

kubectl logs <pod> -n prod --all-containers > evidence-logs.txt
kubectl logs <pod> -n prod --all-containers --previous > evidence-logs-previous.txt   # errors if no previous instance exists
kubectl describe pod <pod> -n prod > evidence-describe.txt
kubectl get pod <pod> -n prod -o yaml > evidence-spec.yaml

  • Logs: container stdout and stderr. --previous returns the logs of the previous container instance instead of the current one, so capture both if the container has restarted
  • Memory and processes: check the process list and memory state from the node via the container runtime
  • Files: tampered or added files inside the container. If you’ve turned on readOnlyRootFilesystem, you only need to look at the emptyDir areas
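The API-side captures can be bundled into one function so that nothing is forgotten under pressure. This is a sketch with a hypothetical function name and hypothetical output file names. It collects only what kubectl can see, and it treats missing previous logs as a note rather than a failure.

```shell
# Hypothetical helper: save a Pod's logs, description, and spec into a directory.
# Usage: collect_evidence <namespace> <pod> <output-dir>
collect_evidence() {
  ns=$1
  pod=$2
  out=$3
  mkdir -p "$out" || return 1
  kubectl logs "$pod" -n "$ns" --all-containers > "$out/logs-current.txt"
  # --previous errors out when no container has restarted; record that
  # instead of aborting the whole collection.
  kubectl logs "$pod" -n "$ns" --all-containers --previous > "$out/logs-previous.txt" 2>&1 \
    || echo "no previous container logs available" >> "$out/logs-previous.txt"
  kubectl describe pod "$pod" -n "$ns" > "$out/describe.txt"
  kubectl get pod "$pod" -n "$ns" -o yaml > "$out/spec.yaml"
}
```

A call such as collect_evidence prod <pod> ./evidence-incident writes four files into that directory, which you can then copy off the cluster.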

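If the container runtime supports it, take a snapshot before stopping. The moment you stop or delete a container, its running state is gone, so capturing the live container as-is is fundamental to forensics. Keep in mind that an image snapshot such as docker commit records the container’s filesystem, not its memory.

# from the node, identify the compromised container via the container runtime
crictl ps                         # list running containers and find the compromised container ID
# on a Docker-based node, snapshot the container's filesystem to an image before stopping
# (docker commit is a Docker command; it is not available on containerd-only nodes)
docker commit <container-id> evidence:incident-0001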

3) Investigate: look inside with kubectl debug #

Once evidence is preserved, you investigate. With a distroless (#13) image there’s often not even a shell, and under readOnlyRootFilesystem you can’t install tools into the container. The tool for this case is kubectl debug. It attaches an ephemeral debug container to the running Pod so you can investigate without restarting it or modifying the compromised container’s filesystem.

kubectl debug -it <pod> -n prod \
  --image=busybox \
  --target=web

--target places the ephemeral debug container in the process namespace of the named container, so you can see that container’s processes from the debug container (the container runtime has to support process namespace targeting). --share-processes is a different option: it only applies together with --copy-to, which debugs a copy of the Pod rather than the live compromised one. Because you don’t inject a new binary into the compromised container itself, you can investigate without contaminating the evidence. If you need to investigate the node itself, use kubectl debug node/<node> to launch a debug Pod with the node filesystem mounted at /host.
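Once the debug container shares the target’s process namespace, the target’s processes appear in ps and its filesystem is reachable through the /proc/<pid>/root link. The snippet below is a sketch meant to be run inside the debug container. The helper name target_path is hypothetical, the PID in the example is only an illustration, and access can be denied when the debug container runs as a different non-root user than the target.

```shell
# Run inside the debug container. Hypothetical helper: build the path under
# which a file of the target container is visible via the shared process namespace.
# Usage: target_path <pid-of-target-process> <absolute-path-in-target>
target_path() {
  printf '/proc/%s/root%s\n' "$1" "$2"
}

# Example session (PID 7 is only an illustration; read the real one from ps):
#   ps                              # find the PID of the target's main process
#   ls -la "$(target_path 7 /tmp)"  # list the target container's /tmp
```

Reading through /proc this way leaves the compromised container’s own filesystem untouched, which fits the goal of not contaminating the evidence.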

Only after the investigation is finished and all evidence is secured do you delete the compromised Pod and redeploy from a clean image.

Exam points #

  • The one core line of immutability: the container’s securityContext.readOnlyRootFilesystem: true. It’s a container-level field, not a Pod-level one
  • Write paths: if the container won’t come up under read-only, mount the failing path as an emptyDir to open it. For nginx, /var/cache/nginx, /var/run, and /tmp
  • Immutable operation: change only via redeploy. Never edit a live container. exec is for investigation
  • Isolation order: don’t delete — isolate first. Block communication with a quarantine label + default-deny NetworkPolicy, and cordon the node (not drain)
  • Evidence preservation: secure logs (current and --previous), memory, and files before deleting. Snapshot before stopping
  • Investigation tool: attach an ephemeral debug container with kubectl debug and --target, which shares the target container’s process namespace (--share-processes only applies with --copy-to). For the node, kubectl debug node/<node>
  • Order is your score: isolate → preserve evidence → investigate → delete and redeploy. Get this order wrong and evidence disappears

Wrap-up #

What this post locked in:

  • An immutable container is one whose contents don’t change after it starts running. It becomes hard for an attacker to plant a binary or tamper with files, and a clean baseline is guaranteed
  • readOnlyRootFilesystem: true is the starting point of immutability. Open only the paths that need writes as exceptions via emptyDir
  • Drift prevention: change only via redeploy. Stabilize the slow startup of a read-only environment with startupProbe
  • Forensics procedure: don’t delete the compromised Pod; isolate it (NetworkPolicy, node cordon) → preserve evidence (logs, memory, files, snapshot before stopping) → investigate with kubectl debug → delete and redeploy
  • On the exam, applying immutability settings and the procedure for isolating a compromised Pod come bundled together, and keeping the order is the score

With this, we’ve worked through every technical domain, all the way to Monitoring, Logging, and Runtime Security — the last of the six domains.

Next — Exam tips #

You’ve locked in all the content. What’s left is execution: turning that content into the 67% passing score within the 2-hour exam window.

In #19 (Exam tips and time management, frequently missed patterns) we’ll gather the shortcut and alias setup to run right at the exam start, the time-management habit of skipping a hard task and coming back to it, how to quickly find the per-tool docs, and frequently missed patterns like the trap where you’ve applied readOnlyRootFilesystem but the container won’t come up.
