Chapter 17

Admission Controller

This chapter covers the admission model, in which the Kubernetes API server inspects and transforms a manifest just before storing it in etcd. It walks through the two types, Mutating and Validating; the built-in controllers (LimitRanger, ResourceQuota, PodSecurity, and others); the webhook mechanism; and a comparison of the two policy engines built on top of it, OPA Gatekeeper (Rego) and Kyverno (YAML).

Chapter 14 (RBAC / NetworkPolicy / ResourceQuota) covered how RBAC controls permissions on the K8s API, NetworkPolicy controls traffic between Pods, and ResourceQuota controls resource totals. This chapter focuses on the next layer of policy: policy that constrains the manifest itself. Rules like “you can’t create a container without limits,” “images must be pulled only from our ECR registry,” and “an owner label is required on every workload” can’t be expressed with RBAC alone. These rules belong in the Admission stage of the K8s API server, and the two tools that insert a policy engine into that stage are OPA Gatekeeper and Kyverno.

By the end of this chapter you’ll have a clear path to enforcing operational policies like “no cluster-admin bundle,” “reject containers without limits,” and “enforce the owner label” at the manifest level. The full-fledged tool for blocking the antipattern we touched on in §“A common pitfall — too-broad ClusterRole” of Chapter 14 appears here.

The Admission stage — just before a manifest enters etcd #

The flow from the moment you type kubectl apply -f my-pod.yaml until that manifest is stored in etcd isn’t a single step. The Kubernetes API server passes the request through the following five stages in order.

The K8s API server's request-handling flow

1. Authentication               — who called
2. Authorization                — RBAC check. Can the caller use this verb/resource
3. Mutating Admission           — transform the manifest (defaulting, sidecar injection, etc.)
4. Validating Admission         — does the manifest satisfy policy
5. Store in etcd

Stages 3 and 4 are the Admission Controller, the subject of this chapter. Even a request that passed authentication and RBAC can be rejected at this stage, and the manifest itself can be transformed before being stored.

Mutating vs Validating #

The difference between the two kinds is clear.

Mutating Admission — transforms the manifest. Example: auto-injecting a sidecar container into every Pod, auto-filling missing labels, applying defaults. Several mutating controllers can be applied in sequence to the same object.
Validating Admission — only inspects the manifest. Pass or reject. No transformation happens. The important point is that it sees the final manifest after all mutating is done.

The order is always mutating → validating. Because the fully transformed manifest is handed to the inspection stage, a validating rule only needs to evaluate “does the final form satisfy policy.”

Built-in Admission Controllers #

Several admission controllers are already compiled inside the K8s API server. Let’s pin down the ones you often meet in a production cluster.

Controller	Kind	Role
`NamespaceLifecycle`	Validating	Block object creation in a namespace being deleted
`LimitRanger`	Mutating + Validating	Apply LimitRange defaults + reject violations
`ResourceQuota`	Validating	Reject when a ResourceQuota total is exceeded
`ServiceAccount`	Mutating	Auto-attach the default ServiceAccount to a Pod
`PodSecurity`	Validating	Enforce Pod Security Standards (1.25+ stable)
`DefaultStorageClass`	Mutating	Auto-fill the default SC on a PVC

The LimitRange from Chapter 11 (resources.requests / limits) and the ResourceQuota from Chapter 14 are both enforced at this admission stage. When a manifest exceeds a ResourceQuota total, the ResourceQuota admission controller rejects it at stage 4. Built-in controllers are enabled with the --enable-admission-plugins API server flag and disabled with --disable-admission-plugins.
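
As a rough illustration of where that flag lives, here is a fragment of a kube-apiserver static Pod manifest. It assumes a kubeadm-style cluster where the API server runs as a static Pod; the image tag and the plugin list are illustrative, and all other flags are omitted. On managed clusters the provider usually controls this flag and you can't edit it.

```yaml
# Fragment only: a kubeadm-style kube-apiserver static Pod (layout assumed).
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      image: registry.k8s.io/kube-apiserver:v1.30.0   # illustrative version
      command:
        - kube-apiserver
        # Plugins listed here are enabled in addition to the default set.
        - --enable-admission-plugins=NodeRestriction,LimitRanger,ResourceQuota
        # The counterpart flag turns off plugins that are on by default:
        # - --disable-admission-plugins=<plugin-name>
```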

Webhook — stepping into the admission stage from outside #

Built-in controllers are embedded in the K8s code, so a user can’t change their definition. When an ops team wants to insert its own policy into the admission stage, it uses a Webhook. There are two kinds.

MutatingWebhookConfiguration — sends the manifest, wrapped in an AdmissionReview request, to an external HTTPS service, and that service returns allow / deny plus an optional JSON Patch describing the transformation.
ValidatingWebhookConfiguration — sends the same kind of AdmissionReview request to an external HTTPS service, and that service returns only allow / deny.

The K8s API server looks at the result of this webhook call and either passes the request through unchanged, transforms it, or rejects it. Both OPA Gatekeeper and Kyverno are policy engines layered on top of this webhook mechanism. They don’t add a new admission kind to K8s; they’re tools that abstract the standard webhook for practical use.
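
To make the registration side concrete, here is a minimal sketch of a ValidatingWebhookConfiguration. The object name, webhook name, Service name, namespace, and path are all hypothetical, and caBundle is a placeholder you would replace with the real CA certificate. Gatekeeper and Kyverno create objects of this shape for you at install time.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy-webhook          # hypothetical
webhooks:
  - name: validate.example.com          # hypothetical; must be a fully qualified name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail                 # what to do when the webhook can't be reached
    timeoutSeconds: 5
    clientConfig:
      service:                          # hypothetical in-cluster Service serving HTTPS
        namespace: policy-system
        name: policy-webhook
        path: /validate
        port: 443
      caBundle: <base64-encoded CA certificate>   # placeholder, not a real value
    rules:                              # which requests are sent to the webhook
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
        scope: Namespaced
```

A MutatingWebhookConfiguration has the same structure; only the kind and the content of the response differ.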

OPA Gatekeeper — policy expressed in Rego #

OPA (Open Policy Agent) is a general-purpose policy engine used outside K8s too. You write policy in its own language called Rego, and the OPA engine evaluates that policy. Gatekeeper is a tool that wraps OPA in a K8s admission webhook.

Gatekeeper has two core objects.

ConstraintTemplate — the blueprint of a policy written in Rego. “Define this kind of policy”
Constraint — an instance of a ConstraintTemplate. “Apply this policy to which resources with which parameters”

This separation is the heart of Gatekeeper’s model. You write the policy once in a ConstraintTemplate, then instantiate it as many times as needed by feeding different parameters into that template. Note that both objects are implemented as K8s CRDs, so Gatekeeper is itself a policy engine layered on the model covered in Chapter 18, the CRD and Operator pattern.

A ConstraintTemplate example — enforcing required labels #

constrainttemplate-required-labels.yaml

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

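        violation[{"msg": msg}] {
          # Labels the Constraint requires, e.g. ["owner", "team"].
          required := input.parameters.labels
          # Fall back to an empty object when the manifest has no labels at all;
          # otherwise the rule body is undefined and no violation is reported.
          provided := object.get(input.review.object.metadata, "labels", {})
          # Check each required label and report the ones that are absent.
          missing := required[_]
          not provided[missing]
          msg := sprintf("Missing required label: %v", [missing])
        }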

The code inside the rego block is the real policy. input.review.object is the manifest at the admission stage, and input.parameters is the parameters passed from the Constraint. If violation[...] is non-empty, the manifest is rejected. Applying the ConstraintTemplate creates a new CRD called K8sRequiredLabels in K8s.

A Constraint example — an instance of the template above #

constraint-require-owner.yaml

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespace-must-have-owner
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["owner", "team"]

Apply this Constraint and, from that moment, every newly created Namespace that lacks either the owner or the team label is rejected at the admission stage.

Attempt to create a Namespace without labels

$ kubectl create ns test
Error from server (Forbidden): admission webhook "validation.gatekeeper.sh" denied the request:
[namespace-must-have-owner] Missing required label: owner
[namespace-must-have-owner] Missing required label: team

Gatekeeper’s extra features #

Beyond policy evaluation, Gatekeeper has a few ops-friendly features.

dry-run / audit mode — applying a Constraint with enforcementAction: dryrun only records violations instead of rejecting requests. It is used to measure the blast radius before enforcing a policy on a live service.
Restricting the evaluation target with the Config object — you can exclude system namespaces like kube-system from evaluation.
External data reference — the Rego in a ConstraintTemplate can reference OPA’s data object (for example, cluster objects that Gatekeeper has been configured to sync) to evaluate policy by looking at other K8s objects or external data.
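
The first two features can be sketched as follows. The Constraint is the earlier namespace-must-have-owner example with enforcementAction added. The Config object assumes Gatekeeper is installed in the gatekeeper-system namespace, and the list of excluded namespaces is illustrative.

```yaml
# The earlier Constraint, switched to record-only mode.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespace-must-have-owner
spec:
  enforcementAction: dryrun            # record violations, don't reject
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["owner", "team"]
---
# Config object: exclude system namespaces from Gatekeeper's processing.
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system         # assumes the default install namespace
spec:
  match:
    - excludedNamespaces: ["kube-system", "kube-public", "kube-node-lease"]
      processes: ["*"]                 # applies to audit, webhook, and sync alike
```

In dry-run mode the violations found by the audit show up in the Constraint's status, so inspecting the Constraint with kubectl get ... -o yaml is one way to read them.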

Kyverno — policy expressed in YAML #

Kyverno is a tool in the same category as OPA Gatekeeper but with a different approach. Writing policy in YAML without learning a new language is Kyverno’s biggest distinguishing point. K8s users are already comfortable with YAML, so the barrier to adopting policy is low.

Kyverno’s three behaviors #

A Kyverno policy does one (or more) of three things.

validate — inspects whether the manifest satisfies a rule (Validating Admission)
mutate — transforms the manifest (Mutating Admission)
generate — automatically creates another object (a Kyverno-only feature)

generate isn’t a behavior of the admission stage itself, but it expresses patterns like “when a Namespace is created, auto-create a default NetworkPolicy inside it” in one policy. An operational pattern like auto-installing the default-deny + allow-dns bundle from Chapter 14 into a new namespace is a common example.
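
The default-deny half of that pattern might look like the sketch below. It is modeled on the style of policy found in the kyverno/policies library; the policy and rule names are illustrative, and it assumes Kyverno has been granted permission to create NetworkPolicy objects.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-deny               # illustrative name
spec:
  rules:
    - name: default-deny
      match:
        any:
          - resources:
              kinds: ["Namespace"]     # trigger: a Namespace is created
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny
        # Create the NetworkPolicy inside the Namespace that triggered the rule.
        namespace: "{{request.object.metadata.name}}"
        synchronize: true              # keep the generated object in sync with this policy
        data:
          spec:
            podSelector: {}            # select every Pod in the namespace
            policyTypes: ["Ingress", "Egress"]
```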

A validate example — reject containers without limits #

policy-require-limits.yaml

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-cpu-memory-limits
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Pod must have CPU and memory limits."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"

The ?* inside pattern means “any value is fine, but it must not be empty.” When this policy is applied, every container of every new Pod must specify both limits.cpu and limits.memory. The policy enforces the resource model covered in Chapter 11 at the admission level; paired with LimitRange’s defaultRequest / default, it ensures that every stored manifest carries explicit resource settings.
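
For reference, a LimitRange that supplies those defaults could look like this. The name, namespace, and quantities are hypothetical. Because LimitRanger fills in defaults during the mutating phase, a Pod that omits limits in such a namespace should reach the validate rule with the defaults already present, so check how the two interact in your cluster before relying on either alone.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults             # hypothetical
  namespace: team-a                    # hypothetical
spec:
  limits:
    - type: Container
      defaultRequest:                  # applied when a container omits requests
        cpu: 100m
        memory: 128Mi
      default:                         # applied when a container omits limits
        cpu: 500m
        memory: 256Mi
```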

A mutate example — auto-add a label to every Deployment and StatefulSet #

policy-add-labels.yaml

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-labels
spec:
  rules:
    - name: add-managed-by
      match:
        any:
          - resources:
              kinds: ["Deployment", "StatefulSet"]
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              managed-by: platform-team

When this policy is applied, even if the manifest has no managed-by label it’s automatically added at the admission stage. It’s a path to enforce a label standard without changing a single line of code.

Gatekeeper vs Kyverno — which to use #

Let’s organize the comparison of the two tools into one table.

Dimension	OPA Gatekeeper	Kyverno
Policy language	Rego (must learn anew)	YAML
Expressiveness	very high (general-purpose Rego query language)	moderate (declarative pattern matching)
Learning curve	steep	low
Policy behaviors	validate, mutate (stable since v3.10)	validate, mutate, generate, cleanup
Non-K8s policy	OPA itself can be used outside K8s	K8s-only
Policy library	rich (gatekeeper-library)	rich (kyverno/policies)

The choice usually follows this pattern.

If you can bear the learning burden of Rego and want to use the same policy engine outside K8s too, Gatekeeper is natural. It’s advantageous when a large organization wants to carry policy consistency across multiple systems.
If a K8s ops team wants to write and maintain policy itself, and the barrier to writing policy itself is the biggest cost, Kyverno is faster. The learning-cost difference in the first month of adoption is large.

Both tools have a solid track record at production scale. Unless you specifically need Rego’s expressiveness from the start, a natural path is to consider Kyverno first and move to Gatekeeper when that expressiveness truly becomes necessary.

Principles to pin down in operations #

Here are a few principles to settle on the operational side when adopting an admission webhook.

1. The two choices of failurePolicy — Fail vs Ignore #

This is the field that decides how the API server behaves when the webhook can’t be called (timeout, network outage, policy-engine Pod down).

failurePolicy: Fail — reject the request if the webhook can’t respond. The policy is never bypassed, but the policy engine’s availability is tied to the cluster’s overall availability. If the policy engine dies, you can’t bring up a new workload.
failurePolicy: Ignore — just pass if the webhook can’t respond. Availability is good but the policy is bypassed.

The usual operational approach is to set important policies to Fail and incidental policies to Ignore. It is also standard practice to run the policy engine itself redundantly (2 or more replicas) and protect it with a PodDisruptionBudget. The PDB of Chapter 30 Upgrade Strategy ties directly to this chapter’s webhook availability.
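
A minimal sketch of such a PodDisruptionBudget follows. The namespace and the Pod label are assumptions about a Kyverno install, so check the labels on your own policy-engine Pods first. The installer's Helm chart may also offer an equivalent setting, in which case use that instead.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: policy-engine-pdb              # hypothetical
  namespace: kyverno                   # assumes the default install namespace
spec:
  minAvailable: 1                      # keep at least one webhook Pod during voluntary disruptions
  selector:
    matchLabels:
      # Assumed label; verify with: kubectl get pods -n kyverno --show-labels
      app.kubernetes.io/component: admission-controller
```

With 2 or more replicas, this keeps a node drain from evicting every webhook Pod at once, which matters most for webhooks set to failurePolicy: Fail.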

2. Excluding system namespaces with namespaceSelector #

Namespaces where K8s’s own workloads run, like kube-system and kube-public, are usually excluded from policy evaluation. This safeguard prevents an incident in which policy blocks the cluster’s own bootstrap.

An example of a webhook's namespaceSelector

namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["kube-system", "kube-public", "kube-node-lease"]

3. Gradual adoption with dry-run #

If you apply a new policy to a production cluster straight in enforce mode, updates to existing workloads can break one after another. The standard flow is as follows.

The orthodox approach to adopting policy

1. Apply in dry-run mode (Gatekeeper's dryrun, Kyverno's Audit)
2. Collect violation logs for a period → measure the blast radius
3. Clean up the violating workloads first
4. Switch to enforce mode

If you skip this cycle, existing manifests are rejected one after another and GitOps sync can grind to a halt. Even if the policy’s intent is right, the adoption procedure must be gradual. The pattern of combining with GitOps is covered once more in Chapter 20 GitOps.
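
On the Kyverno side, step 1 might look like the sketch below: the require-resource-limits policy from earlier with the action set to Audit. The background field is shown explicitly only for clarity. Violations should then appear in Kyverno's policy reports rather than as rejected requests.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Audit       # step 1: record only; switch to Enforce at step 4
  background: true                     # also evaluate resources that already exist
  rules:
    - name: require-cpu-memory-limits
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Pod must have CPU and memory limits."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"
```

Moving to step 4 is then a one-line change in the same manifest, which keeps the rollout reviewable in Git.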

4. Monitoring webhook latency #

An admission webhook is on the critical path of every manifest change. If the policy engine slows down, kubectl apply slows down with it. Both Gatekeeper and Kyverno expose their own metrics, so it’s standard to tie P99 latency and rejection rate into the observability stack covered in Chapter 19 Observability.
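
As one possible starting point, the API server's own webhook latency histogram can drive a P99 alert. This is a sketch of a Prometheus rules file, assuming Prometheus already scrapes the API server; the alert name, the 500ms threshold, and the 10m window are illustrative values, not recommendations.

```yaml
groups:
  - name: admission-webhook-latency    # illustrative group name
    rules:
      - alert: AdmissionWebhookSlowP99
        # P99 latency per webhook, as measured by the API server.
        expr: |
          histogram_quantile(
            0.99,
            sum by (le, name) (
              rate(apiserver_admission_webhook_admission_duration_seconds_bucket[5m])
            )
          ) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Admission webhook {{ $labels.name }} P99 latency above 500ms"
```

Measuring from the API server side captures network and TLS overhead as well, which the policy engine's own metrics don't include.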

Exercises #

Check the admission controllers applied to your cluster (kubectl get validatingwebhookconfigurations, kubectl get mutatingwebhookconfigurations). If Kyverno or Gatekeeper is among them, organize which ClusterPolicy / Constraint is applied into one table, and compare in one paragraph how it’s a different kind of policy from the RBAC, NetworkPolicy, and ResourceQuota of Chapter 14.
Install Kyverno or Gatekeeper on a local cluster and apply the policy “every container of every Pod must have limits.cpu / limits.memory” in dry-run mode. Capture what message is recorded as a violation at the admission stage when you apply a Deployment missing limits, and record step by step how the same manifest is rejected when you switch the same policy to enforce mode. Organize into one paragraph how the responsibility of Chapter 11’s LimitRange differs from this chapter’s policy.
Write out, as a simulation, the scenario of a webhook set to failurePolicy: Fail failing to respond — what error occurs when you apply a new Deployment after all the policy-engine Pods have died, and how it would differ if it were failurePolicy: Ignore. Organize how the policy engine’s availability ties into the cluster’s availability with the model of §“The two choices of failurePolicy.”

In summary: before storing a manifest in etcd, the K8s API server passes it through the admission stage in this order: mutating (transform) → validating (inspect). Beyond the built-in controllers (LimitRanger, ResourceQuota, PodSecurity, etc.), you can insert an external policy engine via webhook. The two common choices are OPA Gatekeeper (Rego, strong expressiveness) and Kyverno (YAML, low barrier). Operational adoption depends on four principles: failurePolicy, excluding system namespaces, gradual dry-run adoption, and monitoring webhook latency.

Next chapter #

This chapter touched on the point that Gatekeeper and Kyverno both build on new object kinds defined by K8s CRDs (CustomResourceDefinitions). Objects like ConstraintTemplate, K8sRequiredLabels, and ClusterPolicy aren’t standard objects of K8s proper but custom resources the two tools defined as CRDs. The next chapter’s subject is the CRD itself: the path to extending the K8s API with new object kinds.

Chapter 18, the CRD and Operator pattern, covers the manifest that defines a new object kind with a CustomResourceDefinition, the Operator pattern based on controller-runtime that operates that object, and the model of representative Operators you meet in operations (CloudNativePG, cert-manager, Argo CD, etc.). If the chapters up to this one dealt with handling K8s’s standard objects in depth, the next chapter is the starting point for extending K8s itself to fit your own domain.