Certified Kubernetes Administrator (CKA) #10 Workloads 1: Deployment in Depth, ReplicaSet, Rolling Update and Rollback

In #9 RBAC we covered how to grant least privilege to users and ServiceAccounts. From this post on, we move into the exam’s Workloads and Scheduling domain (15%). Its first topic is the Deployment, the workload an operator touches most often in a cluster.

You bring an app up, scale replicas out when traffic grows, update to a new image with zero downtime, and roll back to the previous version when something goes wrong. These four tasks are the daily routine of operations, and a single Deployment handles all of them. CKAD approaches the Deployment from the manifest-writing angle; here we take the operator's angle and look in depth at rollout and rollback, handled quickly with kubectl.

The Deployment → ReplicaSet → Pod hierarchy #

The starting point for understanding a Deployment is the fact that it does not create Pods on its own. A Deployment creates a ReplicaSet, and the ReplicaSet creates Pods. That is, it is a three-tier hierarchy.

Deployment   (declares the desired state + manages rollout history)
   └─ ReplicaSet   (maintains as many Pods as the specified replica count)
        └─ Pod ... Pod   (the actual unit running containers)

The role of each tier is clear.

| Resource | Responsibility |
| --- | --- |
| Deployment | Declares the desired state, swaps in a new ReplicaSet on update, and keeps previous versions as revisions |
| ReplicaSet | Always keeps the number of Pods with the labels it manages at the specified replica count |
| Pod | The smallest unit where containers actually run |

When an update happens, the Deployment creates a new ReplicaSet and scales up its Pods while scaling down the Pods of the old one. The old ReplicaSet is not deleted; it stays scaled to 0 replicas and serves as the foothold for a rollback. The number of old ReplicaSets kept this way is capped by spec.revisionHistoryLimit (default 10).

# See the ReplicaSets and Pods owned by one Deployment at a glance
k get deploy,rs,pod -l app=web
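
The ownership link between the tiers is recorded in each object's metadata.ownerReferences. The sketch below wraps a jsonpath query that prints every matching ReplicaSet next to its owner; the function name show_owners is made up for this example, and kubectl is spelled out because shell aliases such as k are not expanded inside scripts.

```shell
# List each ReplicaSet matching a label selector together with its owner.
# $1 = label selector, e.g. app=web
show_owners() {
  kubectl get rs -l "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" <- "}{.metadata.ownerReferences[0].kind}{"/"}{.metadata.ownerReferences[0].name}{"\n"}{end}'
}
```

Calling show_owners app=web against a cluster should print one line per ReplicaSet, each pointing back at the Deployment that owns it.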

Bound by the label selector #

The way the tiers find one another is the label selector. A ReplicaSet identifies the Pods it will manage via spec.selector.matchLabels, and that selector must match spec.template.metadata.labels. If the two diverge, the API server rejects the Deployment creation.

spec:
  selector:
    matchLabels:
      app: web        # manage Pods carrying this label as my own
  template:
    metadata:
      labels:
        app: web      # must match the selector

Another point to remember is that spec.selector is an immutable field after creation. If you need to change the selector, you have to recreate the Deployment, so it is faster to get it right the first time.
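
To make the mismatch case concrete, here is a sketch of a manifest the API server would refuse with a validation error. The name web-bad and the label value web-v2 are invented for the illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-bad            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web-v2        # does not satisfy the selector above, so creation is rejected
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
```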

Creation and scaling #

Rather than writing a manifest by hand from scratch, an operator more often generates a skeleton with kubectl and then refines it. In the exam, too, this is overwhelmingly faster.

# Create a Deployment (replica 3, image specified)
k create deploy web --image=nginx:1.25 --replicas=3

# Extract only the YAML skeleton, edit, then apply ($do is a shell variable: export do="--dry-run=client -o yaml")
k create deploy web --image=nginx:1.25 $do > deploy.yaml
k apply -f deploy.yaml

There are two ways to change the replica count of a created Deployment.

# 1) kubectl scale: the fastest
k scale deploy web --replicas=5

# 2) edit the manifest and reapply: declarative, meshes with GitOps
#    (change spec.replicas in deploy.yaml first)
k apply -f deploy.yaml

When you are in a hurry, k scale is fast, but the orthodox operational approach is the declarative one: change replicas in the manifest and reapply. If the manifest sets spec.replicas, the next k apply reverts a value changed with k scale back to the manifest's value, so mixing the two can make the replica count diverge from your intent.

# Check the current replicas and availability
k get deploy web
# NAME   READY   UP-TO-DATE   AVAILABLE   AGE
# web    5/5     5            5           2m

READY is the ready/desired Pod count, UP-TO-DATE is the number of Pods reflecting the latest template, and AVAILABLE is the number of available Pods. During a rollout these three values diverge, so you look at them together when judging whether an update is complete.

Rolling update #

The Deployment’s real value lies in its update strategy. The default strategy, RollingUpdate, does not take down the old Pods all at once; it brings up new Pods a few at a time while taking down old Pods a few at a time, swapping versions without service interruption.

Strategy parameters: maxSurge and maxUnavailable #

The speed and safety of a rollingUpdate are decided by two parameters.

| Parameter | Meaning | Default |
| --- | --- | --- |
| maxSurge | The number of Pods (or %) that may be temporarily brought up in excess of the desired replica count | 25% |
| maxUnavailable | The maximum number of Pods (or %) that may be unavailable during a rollout | 25% |

Raising maxSurge brings up more new Pods in advance, so it is faster but temporarily uses more resources. Setting maxUnavailable to 0 keeps as many available Pods as desired at all times, so it is the safest for zero downtime, but it slows the rollout because it waits for new Pods to become Ready. When zero downtime is an absolute requirement, maxUnavailable: 0 paired with a maxSurge of at least 1 is the orthodox choice (the two cannot both be 0, because the rollout would have no room to proceed).

spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # allow up to 5 temporarily
      maxUnavailable: 0    # always guarantee 4 available Pods
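
When the two parameters are given as percentages, Kubernetes converts them to Pod counts: maxSurge rounds up and maxUnavailable rounds down. The helper functions below are a small sketch of that arithmetic in plain shell; their names are invented for this example and they do not talk to a cluster.

```shell
# ceil(replicas * pct / 100): how a maxSurge percentage resolves to a Pod count
# $1 = replicas, $2 = percentage (integer, without the % sign)
surge_from_percent() {
  echo $(( ($1 * $2 + 99) / 100 ))
}

# floor(replicas * pct / 100): how a maxUnavailable percentage resolves
unavailable_from_percent() {
  echo $(( $1 * $2 / 100 ))
}

echo "replicas=10: surge=$(surge_from_percent 10 25) unavailable=$(unavailable_from_percent 10 25)"
# prints: replicas=10: surge=3 unavailable=2
```

With the default 25% and only 3 replicas, the same arithmetic gives a surge of 1 and an unavailable count of 0, so small Deployments already roll one Pod at a time without dropping below the desired count.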

Triggering a rollout by swapping the image #

An update usually happens via an image tag change. When something in the template changes, the Deployment creates a new ReplicaSet and starts a rollout.

# Set the container image to a new tag (the most common rollout trigger)
k set image deploy/web nginx=nginx:1.26

# Or edit the live object in your editor
k edit deploy web        # edit spec.template.spec.containers[].image

Here nginx= is the container name. You must write the container name, not the image name, so if you are unsure of the container name, confirm it first with k get deploy web -o jsonpath='{.spec.template.spec.containers[*].name}'.

Tracking rollout status and history #

You track an update with kubectl rollout both while it is in progress and after it is done.

# Watch the progress until the rollout finishes (exits with 0 when done)
k rollout status deploy/web

# Revision history (each revision corresponds to one ReplicaSet)
k rollout history deploy/web

# Template details of a specific revision
k rollout history deploy/web --revision=2

k rollout status blocks until the rollout completes and then exits with code 0; if the rollout fails or a --timeout expires, it exits non-zero. That makes it usable in scripts as a signal that waits for the update to complete. Each revision in k rollout history corresponds to one of the ReplicaSets we saw earlier.
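
One way to use that exit code in a script is sketched below. The function name rollout_or_undo and the 120s default timeout are assumptions made for this example; kubectl is spelled out because shell aliases such as k are not expanded in non-interactive scripts.

```shell
# Wait for a rollout; if it does not finish in time, roll back.
# $1 = Deployment name, $2 = timeout (defaults to 120s, an arbitrary choice)
rollout_or_undo() {
  if kubectl rollout status "deploy/$1" --timeout="${2:-120s}"; then
    echo "rollout of $1 complete"
  else
    echo "rollout of $1 did not complete; rolling back" >&2
    kubectl rollout undo "deploy/$1"
    return 1
  fi
}
```

A typical call would be rollout_or_undo web 60s right after k set image. Whether an automatic rollback is appropriate depends on the service, so treat this as a starting point rather than a policy.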

Recording CHANGE-CAUSE: annotation instead of --record #

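The CHANGE-CAUSE column of k rollout history shows why each revision was created. In the past you filled it by appending --record to the command, but that flag is deprecated. The recommended way now is to attach the kubernetes.io/change-cause annotation directly, right after the change it describes. If the annotation already holds a value from an earlier change, add --overwrite, because k annotate refuses to replace an existing annotation without it.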

# Recommended: record the change reason with an annotation
k annotate deploy/web kubernetes.io/change-cause="update nginx to 1.26"

# Verify
k rollout history deploy/web
# REVISION   CHANGE-CAUSE
# 1          <none>
# 2          update nginx to 1.26

Keeping CHANGE-CAUSE filled in lets you see at a glance which revision was what when picking a rollback target, which speeds up incident response in operations.

Rollback #

When a new version goes wrong, the first thing an operator reaches for is a rollback. Because the Deployment keeps previous ReplicaSets as revisions, you can revert in a single line.

# Roll back to the immediately preceding revision
k rollout undo deploy/web

# Roll back to a specific revision
k rollout undo deploy/web --to-revision=2

A rollback, too, proceeds with zero downtime following the rollingUpdate strategy. When you look at k rollout history after the rollback finishes, the reverted content appears as a new revision, and the revision you rolled back to no longer appears under its old number. That is, revision numbers do not decrease; they keep climbing.
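
The renumbering can be pictured with a toy model in plain shell. This is only an illustration of the bookkeeping described above, not how kubectl implements it, and the function name undo_history is invented for the example.

```shell
# Toy model: a rollback re-labels the target revision as (highest revision + 1).
# $1 = revision to roll back to, remaining args = current revision list
undo_history() {
  target=$1; shift
  max=0; out=""
  for r in "$@"; do
    [ "$r" -gt "$max" ] && max=$r
    [ "$r" -ne "$target" ] && out="$out $r"
  done
  echo "${out# } $((max + 1))"
}

undo_history 1 1 2 3   # prints: 2 3 4
```

Here revision 1 is rolled back to while the history holds 1, 2 and 3: revision 1 leaves the list and its content comes back as revision 4.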

pause and resume: batching changes into one go #

When you change several fields in a row, each change starts its own rollout and creates another ReplicaSet. In that case you stop the rollout with pause, apply all the changes, and then do a single rollout with resume.

# Pause the rollout
k rollout pause deploy/web

# Changes in between do not roll out immediately; they pile up
k set image deploy/web nginx=nginx:1.26
k set resources deploy/web -c=nginx --limits=cpu=200m,memory=256Mi

# Apply the batched changes in a single rollout
k rollout resume deploy/web

pause/resume is also useful for a canary check, where you stop the rollout midway, verify just a portion, and then decide whether to proceed.

recreate vs rollingUpdate #

Besides rollingUpdate, there is also the Recreate strategy. Knowing the difference between the two clearly lets you separate where zero downtime is needed from where it is not.

| Item | RollingUpdate (default) | Recreate |
| --- | --- | --- |
| Swap method | Brings up new Pods while gradually terminating old ones | Terminates all old Pods first, then creates new ones |
| Service interruption | None (zero downtime) | Present (downtime occurs at the moment of switchover) |
| Resources | Temporarily uses more during rollout (maxSurge) | No extra use |
| Two versions coexisting | Briefly coexist | Do not coexist |
| Where to use | Most stateless web/API services | Apps where two versions must not run at once (schema conflicts, etc.) |

You use Recreate when the coexistence of two versions is itself a problem, such as when the old and new versions cannot share the same database schema. For other general stateless services, rollingUpdate is the default and the right answer.
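
Switching strategies is a one-field change in the manifest. The fragment below is a minimal sketch; the replica count is only illustrative.

```yaml
spec:
  replicas: 4
  strategy:
    type: Recreate     # no rollingUpdate block here; maxSurge/maxUnavailable do not apply
```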

Operational view: the conditions under which a zero-downtime update is guaranteed #

The rollingUpdate strategy alone does not automatically guarantee zero downtime. That is because Kubernetes has to know whether a new Pod is truly ready to receive traffic. This signal is the readinessProbe.

Without a readinessProbe, a container is considered Ready the moment it starts, so the Service sends traffic even while the app is still initializing. When rollingUpdate counts this Pod as available and takes down an old Pod, requests fail at that moment. So the actual condition for zero downtime is the combination of rollingUpdate strategy + an accurate readinessProbe + maxUnavailable: 0. Probes are covered further in #11 onward and in the troubleshooting installments.
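
Put together, the three conditions might look like the fragment below. It is a sketch under assumptions: the probe path, port, and timing values are placeholders that would have to match how the real application reports readiness, and the selector and labels are omitted for brevity.

```yaml
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0          # never drop below 4 available Pods
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.26
        readinessProbe:
          httpGet:
            path: /              # assumed readiness endpoint
            port: 80             # assumed container port
          initialDelaySeconds: 2 # illustrative timing values
          periodSeconds: 5
```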

Exam points #

  • Remember the hierarchy. A Deployment creates a ReplicaSet, and a ReplicaSet creates Pods. Check them at once with k get deploy,rs,pod.
  • The selector is immutable and must match template.labels. Get it right the first time.
  • Handle creation quickly with k create deploy --image= --replicas=, scaling with k scale deploy --replicas=.
  • Image swap is k set image deploy/<name> <container-name>=<image>. Write the container name, not the image.
  • Rollout tracking is k rollout status/history, rollback is k rollout undo [--to-revision=N]. Revision numbers do not decrease; they climb.
  • --record is deprecated. Record the change reason with the kubernetes.io/change-cause annotation.
  • When zero downtime is the requirement, recall the strategy.rollingUpdate.maxUnavailable: 0 combination.

Wrap-up #

What this post locked in:

  • The Deployment → ReplicaSet → Pod hierarchy and the label selector that binds it
  • Creation and scaling with k create deploy/k scale, and the difference between the declarative and imperative approaches
  • The rollingUpdate strategy (maxSurge/maxUnavailable), and triggering a rollout with k set image/k edit
  • Tracking versions and reverting via k rollout status/history/undo, and batching changes with pause/resume
  • The difference between recreate vs rollingUpdate and the actual condition under which a zero-downtime update is guaranteed (tied to the readinessProbe)

How a Deployment works is the foundation of Kubernetes workloads. How a controller maintains the desired state can also be reinforced from k8s basics #4, which laid the groundwork before #3 the Pod networking model.

Next — Workloads 2 #

The Deployment was a workload for stateless services. But there are more workloads in a cluster that this mold cannot hold.

In #11 Workloads 2: DaemonSet, StatefulSet, Job, CronJob, we cover the DaemonSet that brings up one Pod per node, the StatefulSet that needs stable identifiers and ordering, the Job that runs once and finishes, and the CronJob that runs periodically. We will work through, in turn, what situation each workload was designed for and how to create and operate it with kubectl.
