Certified Kubernetes Security Specialist (CKS) #17 Falco behavioral analysis, audit logs (Runtime)


Wednesday, June 3, 2026


The previous post, #16 Admission control: OPA/Gatekeeper, Kyverno, dealt with the up-front controls that stop dangerous manifests before they enter the cluster. But you can’t block every attack at the door. When an attacker spawns a shell inside a normally deployed Pod, reads a sensitive file, or escalates privileges, it happens after the admission stage has already passed. This post covers the final domain, Monitoring, Logging, and Runtime Security, which is about detecting the abnormal behavior that already-running workloads exhibit at runtime.

Runtime security has two clear pillars: Falco, which watches the syscalls happening in the node’s kernel, and the audit log, which records who sent what request to the API server. Think of the former as observing behavior inside the container, and the latter as observing requests that came into the cluster’s control plane. Both are exam regulars, so we’ll get a feel for writing rules and policies by hand and reading the output.

What is runtime threat detection #

The domains so far have mostly been up-front controls. We blocked communication with NetworkPolicy, rejected dangerous Pods with PSA, and inspected manifests with admission webhooks. These controls act before an attack begins. But consider a situation like this.

A container deployed from a clean image is compromised through an unknown vulnerability
The attacker spawns /bin/bash inside that container to get an interactive shell
Inside the container, they read /etc/shadow or explore host directories

These actions are invisible at the manifest level. They happen inside a normal Pod that has already passed every entry control. Runtime detection observes the actual behavior of already-running workloads like this and catches the abnormal. The difference from up-front control is that the primary goal is to see and alert rather than to block.

Falco: a syscall-based rule engine #

Falco is CNCF’s runtime security tool. It receives Linux kernel system calls (syscalls) and Kubernetes audit events in real time, matches them against predefined rules, and emits violations as alerts. It catches the moment a container spawns a shell, opens a sensitive file, or escalates privileges, right there in the syscall flow.

Falco collects syscalls through a driver that hooks into the kernel: either a kernel module or an eBPF probe. Either way the same rule engine evaluates the events, so for the exam, the ability to read and write rules matters more than the choice of collection driver.

Rule structure: rule, condition, output, priority #

A Falco rule is defined as a single YAML item. The key fields are as follows.

Field	Role
`rule`	Rule name. Shown in the alert
`desc`	Rule description
`condition`	An expression that decides which events count as a violation
`output`	A template defining what goes into the single alert line
`priority`	Severity (EMERGENCY–DEBUG)
`tags`	Classification tags

Let’s look at the most fundamental shell-execution detection rule.

- rule: Terminal shell in container
  desc: A shell was used as the entrypoint/exec target in a container
  condition: >
    spawned_process and container
    and shell_procs and proc.tty != 0
    and container_entrypoint
  output: >
    A shell was spawned in a container
    (user=%user.name container_id=%container.id
     container_name=%container.name shell=%proc.name
     parent=%proc.pname cmdline=%proc.cmdline)
  priority: NOTICE
  tags: [container, shell, mitre_execution]

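The condition is a boolean expression. spawned_process matches the event of a new process being spawned, and container restricts the match to events that happened inside a container. proc.tty != 0 narrows it further to processes with a terminal attached, and container_entrypoint to processes started directly by the container runtime (the entrypoint or an exec target). Joined with and, these conditions catch the pattern “an interactive shell process was spawned inside a container.”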

Making rules readable with macro and list #

shell_procs and container in the rule above are predefined macros. A frequently used piece of a condition is given a name and reused. A list gives a name to a bundle of values.

- list: shell_binaries
  items: [bash, sh, zsh, ksh, csh, ash, dash]

- macro: shell_procs
  condition: proc.name in (shell_binaries)

A list defines a bundle of shell binary names, and a macro gives the name shell_procs to the condition “the process name is in that bundle.” This keeps the rule’s condition short and readable. On the exam, when you modify an existing rule, the answer is often to add a single item to a list.
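A minimal sketch of that "add one item to a list" edit, done from the local rules file rather than the default one. The fish entry is only an illustrative value, and RULES_FILE points at a scratch file here; on a real node it would be /etc/falco/falco_rules.local.yaml. The override syntax assumes Falco 0.36 or later; older releases used append: true on the list instead.

```shell
# Scratch file by default; set RULES_FILE to the real local rules file on a node.
RULES_FILE="${RULES_FILE:-$(mktemp)}"

cat >> "$RULES_FILE" <<'EOF'
# Append one more shell to the predefined list (fish is an illustrative value).
# Assumes Falco 0.36+; older releases used "append: true" instead of "override".
- list: shell_binaries
  items: [fish]
  override:
    items: append
EOF

# Show what was written
cat "$RULES_FILE"
```

Because shell_procs refers to the list by name, every rule that uses the macro picks up the new item without any change to the rule itself.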

priority: severity levels #

priority indicates the severity of an alert, ordered from most to least severe as follows.

EMERGENCY  ALERT  CRITICAL  ERROR  WARNING  NOTICE  INFORMATIONAL  DEBUG

In the Falco configuration, you can set a priority threshold to filter so that only alerts at or above a certain level are output. The exam may give you an adjustment like “make only WARNING and above visible,” so it’s worth memorizing the level order.
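As a sketch of the “WARNING and above” adjustment, assuming the threshold is the top-level priority key in /etc/falco/falco.yaml and that the stock value is debug. FALCO_CONF is a scratch stand-in here so the commands can be rehearsed without touching a real node.

```shell
# Scratch copy by default; set FALCO_CONF=/etc/falco/falco.yaml on a real node.
FALCO_CONF="${FALCO_CONF:-$(mktemp)}"

# Stand-in for the relevant line of falco.yaml (assumed default value)
[ -s "$FALCO_CONF" ] || printf 'priority: debug\n' > "$FALCO_CONF"

# Raise the threshold so only WARNING and more severe rules stay active
# (sed -i as in GNU sed, which is what a Linux exam node provides)
sed -i 's/^priority:.*/priority: warning/' "$FALCO_CONF"

# Confirm the change
grep '^priority:' "$FALCO_CONF"
```

On a real node the change only takes effect after Falco is restarted, for example with systemctl restart falco.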

Default rules: shell execution, sensitive file access, privilege escalation #

When you install Falco, /etc/falco/falco_rules.yaml contains validated default rules. The representative detection items are as follows.

Default rule	Behavior caught
Terminal shell in container	An interactive shell run inside a container
Read sensitive file untrusted	Reading sensitive files like `/etc/shadow`, `/etc/sudoers`
Write below etc	Writing files under `/etc`
Launch privileged container	Running a privileged container
Change thread namespace	A container attempting to escape into a host namespace
Mkdir binary dirs	Creating directories under binary directories like `/bin`, `/usr/bin`

These default rules alone detect a broad range of common intrusion behavior. As a rule, don’t edit the default rules file directly, because it gets overwritten on a Falco upgrade.

Custom rules: falco_rules.local.yaml #

When you add a rule or override an existing one, write it in /etc/falco/falco_rules.local.yaml. Falco reads the default rules first and then the local file, so the local file applies later and safely augments or redefines the default rules.

For example, let’s add a custom rule that detects writes to a specific directory.

# /etc/falco/falco_rules.local.yaml
- rule: Write to app config dir
  desc: Detect any write attempt under /app/config
  condition: >
    open_write and container
    and fd.name startswith /app/config
  output: >
    Write under /app/config detected
    (user=%user.name file=%fd.name
     container=%container.name command=%proc.cmdline)
  priority: WARNING
  tags: [filesystem, custom]

open_write is a macro that catches the event of opening a file in write mode, and fd.name is the target file path. startswith compares the path prefix. After adding a rule, you have to make Falco reload.

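# check rule syntax first (load the default rules too, since the local rule uses their macros)
falco -V /etc/falco/falco_rules.yaml -V /etc/falco/falco_rules.local.yaml

# then restart if running as a systemd service
systemctl restart falco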

Reading the output: which Pod, process, syscall #

A single Falco alert line comes filled with the fields you put in the output template. The actual output looks like this.

14:32:07.991 Notice A shell was spawned in a container
(user=root container_id=3f2a1b container_name=nginx-app
 shell=bash parent=runc cmdline=bash -i)

What you need to read from this single line is clear.

Which container: container_name=nginx-app, container_id=3f2a1b
What process: shell=bash, cmdline=bash -i confirms it’s an interactive shell
Who ran it: user=root
Parent process: parent=runc

On the exam, you get tasks like “find the name of the Pod where a shell was spawned in the Falco log and write it to a file.” Depending on the configuration, Falco alerts usually go to /var/log/syslog, the systemd journal (journalctl -u falco), or a separate file, so start by checking the output location.

# check Falco alerts in the systemd journal
journalctl -u falco --no-pager | grep "shell was spawned"

# example of extracting only the container name where a shell was spawned
# (match on the output text; the rule name is not part of the default text output)
journalctl -u falco --no-pager \
  | grep "shell was spawned" \
  | grep -oP 'container_name=\K[^ )]+'
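The extraction can be rehearsed without a running Falco. The sketch below uses made-up alert lines in a temporary file in place of the journal, then filters, deduplicates, and writes the result to a file the way an exam task would ask. LOG and OUT are hypothetical names for this example only.

```shell
# Sample alert lines standing in for `journalctl -u falco` output (made-up data)
LOG="$(mktemp)"
cat > "$LOG" <<'EOF'
14:32:07.991 Notice A shell was spawned in a container (user=root container_id=3f2a1b container_name=nginx-app shell=bash parent=runc cmdline=bash -i)
14:35:10.120 Warning Write under /app/config detected (user=root file=/app/config/a.yaml container=api command=touch /app/config/a.yaml)
14:40:55.004 Notice A shell was spawned in a container (user=root container_id=3f2a1b container_name=nginx-app shell=sh parent=runc cmdline=sh)
EOF

# Keep shell alerts, pull the container name, drop duplicates, save the answer
OUT="$(mktemp)"
grep "shell was spawned" "$LOG" \
  | grep -oP 'container_name=\K[^ )]+' \
  | sort -u > "$OUT"

cat "$OUT"
```

The sort -u step matters when the same container triggers the rule repeatedly and the task expects each name once.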

Let’s collect the frequently used field names.

Field	Meaning
`proc.name`	Process name
`proc.cmdline`	Command line that was run
`proc.pname`	Parent process name
`user.name`	The user who ran it
`fd.name`	Path of the accessed file
`container.name`	Container name
`container.id`	Container ID
`k8s.pod.name`	Pod name (Kubernetes metadata)
`evt.type`	Event type (the syscall name, e.g. `execve`, `openat`)

audit log: Kubernetes API auditing #

Where Falco watches behavior in the node kernel, the audit log records the requests that came into the Kubernetes API server. It captures who requested what action (get/create/delete) on which resource, and what the result was. The answers to questions like “who read this Secret” and “which ServiceAccount created the Pod” are here.

audit policy: what and how much to record #

The audit log is defined by the audit policy. Recording every request would make the log explode, so the policy filters which requests to keep and at what level. The policy file usually lives at /etc/kubernetes/audit-policy.yaml.

There are four recording levels.

level	Recorded content
`None`	Not recorded
`Metadata`	Request metadata only (who, what, when). Body excluded
`Request`	Metadata + request body
`RequestResponse`	Metadata + request body + response body

There is also a stage indicating the point at which a request is processed.

stage	Point
`RequestReceived`	Right after the request is received
`ResponseStarted`	When the response starts being sent (mainly watch)
`ResponseComplete`	When the response finishes
`Panic`	When an internal panic occurs

audit policy example #

The policy evaluates rules top to bottom, and the level of the first matching rule applies. So rule order matters. The following is a form that gets varied often on the exam.

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# omit recording all RequestReceived stages (reduce noise)
omitStages:
  - "RequestReceived"
rules:
  # keep only metadata for Secrets and ConfigMaps
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]

  # Pod changes in a specific namespace down to the body
  - level: Request
    namespaces: ["prod"]
    resources:
      - group: ""
        resources: ["pods"]
    verbs: ["create", "update", "delete"]

  # do not record read-only system requests
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch", "get"]

  # all other requests, metadata only
  - level: Metadata

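The policy above keeps only metadata for Secrets and ConfigMaps, the request body for Pod changes in the prod namespace, ignores kube-proxy’s reads, and keeps metadata only for the rest. Because the first match wins, order has consequences: a get on a ConfigMap by system:kube-proxy matches the first rule and is recorded at Metadata, so the None rule further down never applies to it. If that request should be silenced too, the None rule has to move above the Secrets/ConfigMaps rule. Also, if you drop the final catch-all rule (level: Metadata), requests that match no rule won’t be recorded at all, so be careful.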

apiserver flag configuration #

Once you’ve created the policy file, you have to turn on flags so the API server uses that policy. On a kubeadm cluster, you edit /etc/kubernetes/manifests/kube-apiserver.yaml directly.

Flag	Role
`--audit-policy-file`	Path of the audit policy file to apply
`--audit-log-path`	Path of the file to write the audit log to
`--audit-log-maxage`	Days to retain log files
`--audit-log-maxbackup`	Number of log files to keep
`--audit-log-maxsize`	Log file rotation size (MB)

kube-apiserver is a static Pod, so you have to mount the policy file and log directory as hostPath volumes for them to be accessible inside the container. Missing this mount, which keeps the apiserver from starting, is the most common mistake on the exam.

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
        - --audit-log-maxage=7
        - --audit-log-maxbackup=2
        - --audit-log-maxsize=50
        # existing flags...
      volumeMounts:
        - name: audit-policy
          mountPath: /etc/kubernetes/audit-policy.yaml
          readOnly: true
        - name: audit-log
          mountPath: /var/log/kubernetes/audit/
          readOnly: false
  volumes:
    - name: audit-policy
      hostPath:
        path: /etc/kubernetes/audit-policy.yaml
        type: File
    - name: audit-log
      hostPath:
        path: /var/log/kubernetes/audit/
        type: DirectoryOrCreate

When you save the manifest, the kubelet automatically restarts the apiserver Pod. It takes time to come back up, so check whether the apiserver started normally with crictl ps or kubectl get pods -n kube-system. While the apiserver is down, kubectl itself gets no response, so crictl ps on the control-plane node is the more dependable check. If it fails to start, first suspect the directory permissions on the log path or a missing hostPath mount.
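Since a missing mount is the usual cause, a quick text check before saving can catch it. The following is a rough sketch, not a validator. check_audit_manifest is a hypothetical helper, and it assumes the volume names audit-policy and audit-log from the excerpt above. It only greps for the two flags and confirms each volume name appears at least twice (once under volumeMounts, once under volumes).

```shell
# Hypothetical helper: report which audit-related pieces a manifest lacks.
check_audit_manifest() {
  manifest="$1"
  status=0
  # Each flag must be present
  for flag in '--audit-policy-file=' '--audit-log-path='; do
    grep -qF -- "$flag" "$manifest" || { echo "missing flag: $flag"; status=1; }
  done
  # Each volume name must appear twice: under volumeMounts and under volumes
  for vol in 'name: audit-policy' 'name: audit-log'; do
    count="$(grep -cF -- "$vol" "$manifest" || true)"
    [ "$count" -ge 2 ] || { echo "incomplete volume: $vol (found $count)"; status=1; }
  done
  return "$status"
}

# Demo on a scratch manifest that has the flags but no volumes at all
DEMO="$(mktemp)"
cat > "$DEMO" <<'EOF'
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
EOF
check_audit_manifest "$DEMO" || echo "manifest is not ready"
```

On a control-plane node the same function would be pointed at /etc/kubernetes/manifests/kube-apiserver.yaml. A clean result only means the pieces are present as text; it does not prove that the paths line up.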

Log analysis #

The audit log is written as JSON Lines: one JSON record per line. Let’s look at one entry (pretty-printed here for readability).

{
  "kind": "Event",
  "level": "Metadata",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/prod/secrets/db-cred",
  "verb": "get",
  "user": { "username": "dev-user" },
  "objectRef": {
    "resource": "secrets",
    "namespace": "prod",
    "name": "db-cred"
  },
  "responseStatus": { "code": 200 }
}

From this single record you read “dev-user read the db-cred Secret in the prod namespace and succeeded.” On the exam you get tasks of filtering for specific conditions with jq.

# extract only the users who accessed a specific Secret
jq 'select(.objectRef.resource=="secrets"
      and .objectRef.name=="db-cred")
      | .user.username' \
  /var/log/kubernetes/audit/audit.log

# view only delete actions in chronological order
jq 'select(.verb=="delete")
      | {time:.requestReceivedTimestamp,
         user:.user.username,
         res:.objectRef.resource}' \
  /var/log/kubernetes/audit/audit.log

Once one pattern is second nature (filter with jq’s select, then pull only the fields you want with object construction), most analysis tasks go quickly.
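The same select pattern extends to counting. The sketch below runs against made-up records in a temporary file (AUDIT and RESULT are names for this example only) and answers “who read the db-cred Secret successfully, and how many times?” It assumes jq is installed, as it is in the commands above.

```shell
# Made-up audit records, one JSON object per line
AUDIT="$(mktemp)"
cat > "$AUDIT" <<'EOF'
{"verb":"get","user":{"username":"dev-user"},"objectRef":{"resource":"secrets","namespace":"prod","name":"db-cred"},"responseStatus":{"code":200}}
{"verb":"get","user":{"username":"ci-bot"},"objectRef":{"resource":"secrets","namespace":"prod","name":"db-cred"},"responseStatus":{"code":403}}
{"verb":"delete","user":{"username":"dev-user"},"objectRef":{"resource":"pods","namespace":"prod","name":"web-1"},"responseStatus":{"code":200}}
{"verb":"get","user":{"username":"dev-user"},"objectRef":{"resource":"secrets","namespace":"prod","name":"db-cred"},"responseStatus":{"code":200}}
EOF

# -r prints raw strings (no JSON quotes) so sort and uniq can work on them
RESULT="$(jq -r 'select(.objectRef.resource=="secrets"
                    and .objectRef.name=="db-cred"
                    and .responseStatus.code==200)
             | .user.username' "$AUDIT" | sort | uniq -c)"
echo "$RESULT"
```

Adding the responseStatus.code condition separates successful reads from denied attempts, which is often the distinction a task is after.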

Exam points #

Let’s collect the tasks that frequently show up hands-on in the runtime domain.

Anomaly detection with Falco rules. Reading an existing rule and grasping what behavior it catches, or widening the detection scope by adding an item to a list
Extracting information from Falco output. Finding the container/Pod name where a shell was spawned or the user who ran it in the log and writing it to a designated file. Start by checking the log location (journalctl -u falco or the config file path)
Writing custom rules. Adding a rule to /etc/falco/falco_rules.local.yaml and reloading Falco to apply it
Writing an audit policy. Writing the level and rule order precisely for the required resources/verbs/namespaces. Not dropping the final catch-all rule
Enabling apiserver auditing. Handling the --audit-policy-file and --audit-log-path flag additions together with the hostPath volume mount. Always verifying the apiserver comes back up
audit log analysis. Filtering by specific user/resource/verb conditions with jq’s select to find the answer

The two most common point losses are: writing the audit policy well but dropping the hostPath mount in the apiserver manifest so the apiserver dies, and adding a Falco rule but not reloading so it doesn’t take effect. Don’t forget that for policies and rules, applying them and verifying they started is part of the same task.

Wrap-up #

What this post locked in:

Runtime threat detection sees the behavior that up-front controls missed. It catches shell execution, sensitive file access, and privilege escalation inside a normally deployed Pod in real time
Falco is a syscall-based rule engine. The rule/condition/output/priority structure, reusing conditions with macro and list, leaving the default rules untouched and writing custom ones in falco_rules.local.yaml
Falco output packs the container, process, user, and syscall into one line. Find the log location first, then extract the fields
The audit log records API server requests. The level (None/Metadata/Request/RequestResponse), stage, and rule order are the core, with the catch-all rule placed last
Enabling apiserver auditing is a bundle of flags + hostPath mount + startup verification. A missing mount is the most common mistake
Log analysis is mostly solved with the single pattern of jq’s select

Next — Container immutability #

We’ve gotten a feel for detecting abnormal behavior at runtime. But a more fundamental response than detection is making the container unmodifiable in the first place.

#18 Container immutability, forensics covers locking a container’s filesystem read-only with readOnlyRootFilesystem, security context settings for immutable containers, and the basics of forensics — collecting and analyzing evidence after a breach — to wrap up the runtime domain.