Certified Kubernetes Application Developer (CKAD) #8 Deployment Strategies: Blue-green, Canary

Shipping a new version without dropping service is a basic operational skill. Cloud-managed deployment tools and service meshes automate this in sophisticated ways, but CKAD asks whether you can produce the same result on vanilla Kubernetes, where none of those tools exist. In other words, it tests whether you can build blue-green and canary by hand with nothing but Deployments, Services, and labels.

There’s only one core idea: a Service finds its Pods through a label selector. By controlling how that selector changes, and which Pods match it at any given moment, you can steer traffic by hand. In this post, we’ll review the Deployment’s built-in rolling update and then build blue-green and canary using label switching alone.
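
To make the matching rule concrete, here is a minimal shell sketch of equality-based selection: a Pod is selected only if it carries every key=value pair in the selector. The matches function and the comma-separated label strings are illustrative assumptions, not a Kubernetes API.

```shell
# matches SELECTOR LABELS
# Hypothetical helper: both arguments are comma-separated key=value lists.
# Succeeds (exit 0) only if every selector pair also appears in the labels.
matches() {
  labels=",$2,"
  old_ifs=$IFS
  IFS=,
  for pair in $1; do
    case "$labels" in
      *",$pair,"*) ;;                 # this pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;    # a pair is missing: not selected
    esac
  done
  IFS=$old_ifs
  return 0
}

# A selector with both labels picks blue Pods only.
matches 'app=web,version=blue' 'app=web,version=blue'  && echo "blue Pod: selected"
matches 'app=web,version=blue' 'app=web,version=green' || echo "green Pod: not selected"
# A selector with only the shared label picks either kind.
matches 'app=web' 'app=web,version=green' && echo "green Pod: selected by app=web"
```

The first form is what blue-green relies on; the second is what canary relies on.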

Starting point: review of rolling update and recreate #

Before building deployment strategies by hand, let’s cover the two strategies a Deployment offers out of the box. A Deployment’s spec.strategy.type takes one of two values: RollingUpdate (the default) and Recreate.

RollingUpdate doesn’t tear down the existing Pods all at once; it brings up new Pods gradually and replaces the old ones piece by piece. The pace is controlled by maxSurge (how many Pods beyond the desired replica count may exist temporarily) and maxUnavailable (how many Pods below the desired count may be unavailable at once). Both accept an absolute number or a percentage, and both default to 25%. When you change the image, old-version and new-version Pods briefly run behind the same Service, and as long as new Pods become ready before old ones are removed, the transition completes without downtime. We covered the detailed behavior in #5 and K8s Hands-on #4.

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
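
As a rough illustration of what these two fields permit, the sketch below computes the resulting bounds for absolute values. The rollout_bounds helper and the replica count of 3 are assumptions for illustration; percentage values are not handled.

```shell
# rollout_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE
# Hypothetical helper for absolute (non-percentage) settings only.
# Prints the most Pods that may exist at once and the fewest that must stay available.
rollout_bounds() {
  echo "max_total=$(( $1 + $2 )) min_available=$(( $1 - $3 ))"
}

# Assumed example: replicas=3 with maxSurge=1 and maxUnavailable=0.
rollout_bounds 3 1 0   # prints: max_total=4 min_available=3
```

With maxUnavailable at 0, the Deployment has to bring a new Pod up and ready before it may remove an old one, which is why this setting is a common choice when capacity must not dip.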

Recreate takes down all existing Pods first, then brings up the new ones. This causes downtime in between, so it is reserved for cases where old and new versions must not run at the same time (for example, a single-writer database migration).

spec:
  strategy:
    type: Recreate

Rolling update is powerful but has limits: both versions serve traffic at the same time during the swap, and the traffic ratio between them is hard to control precisely. Blue-green and canary exist to fill those gaps. Kubernetes provides neither as a separate resource; you build both by hand by combining Deployment, Service, and labels.

Blue-green: instant cutover by selector switch #

Blue-green means keeping the current version (blue) and the new version (green) up at the same time, then switching the Service’s selector to green in one move so that all traffic shifts at once. The Service never selects both versions at the same time, and if something goes wrong, you switch the selector back to blue for an immediate rollback.

1) blue Deployment and Service #

First, bring up blue and create a Service that points only at those Pods. The key is putting version: blue in the selector.

# blue Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
      version: blue
  template:
    metadata:
      labels:
        app: web
        version: blue
    spec:
      containers:
      - name: web
        image: nginx:1.25
---
# A Service whose selector includes both labels
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue
  ports:
  - port: 80
    targetPort: 80

At this point the web Service selects only Pods that are both app: web and version: blue, so all traffic goes to blue.

2) Bring up the green Deployment separately #

For the new version, create a separate Deployment with its own name, the new image, and the label version: green. Since the Service’s selector is still blue, green receives no traffic. In this state, you validate green thoroughly.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
      version: green
  template:
    metadata:
      labels:
        app: web
        version: green
    spec:
      containers:
      - name: web
        image: nginx:1.27

If you want to validate green by hitting it directly, create a separate temporary test Service or peek in with kubectl port-forward deploy/web-green 8080:80.
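
One way to set up such a temporary test Service is sketched below. The function names and the Service name web-green-test are assumptions for illustration, and the commands need kubectl plus a cluster where web-green already exists, so the functions are defined here but not called.

```shell
# Hypothetical helpers; the Service name web-green-test is an assumption.
expose_green_for_test() {
  # kubectl expose infers the Service selector from the Deployment
  # (app=web,version=green), so this Service reaches green Pods only
  # and leaves the live "web" Service untouched.
  kubectl expose deployment web-green --name=web-green-test --port=80 --target-port=80
  # Confirm that only green Pod IPs are bound behind it
  kubectl get endpoints web-green-test
}

cleanup_green_test() {
  kubectl delete service web-green-test
}

# Usage, once the green Pods are Running:
#   expose_green_for_test
#   ...send test requests to web-green-test from inside the cluster...
#   cleanup_green_test
```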

3) Cutover: switch the Service selector to green #

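Once validation is done, change the Service’s selector to green. This single change redirects all new traffic to green as soon as the Service’s endpoints are updated, which is typically near-instant. Two imperative commands do the job quickly (k here is the common shell alias for kubectl).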

# Option 1: kubectl set selector
k set selector svc web 'app=web,version=green'

# Option 2: kubectl patch
k patch svc web -p '{"spec":{"selector":{"app":"web","version":"green"}}}'

Verify with k get endpoints web that the Pod IPs behind the Service have changed to the green Pods.

4) Rollback: turn the selector back to blue #

If a problem turns up in green, all you do is turn the selector back to blue. Because you left the blue Deployment in place, the rollback finishes instantly.

k patch svc web -p '{"spec":{"selector":{"app":"web","version":"blue"}}}'
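
Cutover and rollback are the same operation with a different value, which the following sketch makes explicit. The selector_patch and switch_to helpers are hypothetical names; switch_to needs kubectl and a cluster, so it is defined but not called.

```shell
# selector_patch VERSION
# Hypothetical helper: prints the JSON patch body for the web Service selector.
selector_patch() {
  printf '{"spec":{"selector":{"app":"web","version":"%s"}}}' "$1"
}

# switch_to VERSION: patch the web Service, then show which Pod IPs are bound.
switch_to() {
  kubectl patch svc web -p "$(selector_patch "$1")"
  kubectl get endpoints web
}

selector_patch green; echo
# prints: {"spec":{"selector":{"app":"web","version":"green"}}}

# Usage: switch_to green   (cutover)
#        switch_to blue    (rollback)
```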

Once you judge green to be stable, delete the blue Deployment to reclaim resources. The cost of blue-green is that both versions run side by side from the moment green comes up until blue is deleted, doubling your resource usage for that period. In return, cutover and rollback each require only a single selector change, which makes them the fastest of the three strategies.

Canary: traffic splitting via shared label and replicas ratio #

Canary reduces risk by routing a small slice of traffic to the new version first instead of exposing it to everyone at once. Where blue-green swaps the selector wholesale for an instant switch, canary makes a single Service select both the stable and canary Pods together to divide traffic.

Core idea: simultaneous selection via a shared label #

Set the Service’s selector to only a label that the two Deployments share (for example, app: web). Then both the stable Pods and the canary Pods are bound behind the same Service. A Service distributes traffic roughly evenly across the Pods it selects, so the ratio of replica counts becomes an approximation of the traffic ratio.

1) stable Deployment and shared-selector Service #

# stable: replicas 9
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: web
      track: stable
  template:
    metadata:
      labels:
        app: web
        track: stable
    spec:
      containers:
      - name: web
        image: nginx:1.25
---
# The Service selects only the shared label app: web
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80

The key is that there’s no track in the Service’s selector. Since it looks only at app: web, anything with that label receives traffic, whether stable or canary.

2) Add a small canary Deployment #

Bring up the new version as a separate Deployment with a low replica count. Give it the track: canary label, but keep the shared app: web label so the Service still selects it. With 9 stable and 1 canary Pods the ratio is roughly 9:1, so only about 10% of total traffic flows to canary.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
      track: canary
  template:
    metadata:
      labels:
        app: web
        track: canary
    spec:
      containers:
      - name: web
        image: nginx:1.27

In this state, k get endpoints web shows the Pod IPs of all 9 stable and 1 canary together. To raise canary’s share, increase its replicas; to keep the total Pod count constant, lower stable’s replicas by the same amount.

# Raise canary share to about 30% (stable 7 : canary 3)
k scale deploy web-canary --replicas=3
k scale deploy web-stable --replicas=7
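
The arithmetic behind these ratios can be sketched as follows. The canary_share helper is a hypothetical name, and the result is only an approximation that assumes every ready Pod receives an equal share of requests.

```shell
# canary_share STABLE CANARY
# Hypothetical helper: approximate percentage of traffic reaching canary,
# assuming each Pod gets an equal share (integer division, rounds down).
canary_share() {
  echo $(( $2 * 100 / ($1 + $2) ))
}

canary_share 9 1   # prints: 10
canary_share 7 3   # prints: 30
canary_share 9 3   # prints: 25 (canary raised without lowering stable)
```

The last case shows why stable is scaled down as well: raising canary alone changes the total, so the share ends up lower than the 30% target.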

3) Promote or discard #

If the canary’s metrics (error rate, latency) are normal, replace stable with the new version and retire the canary. Bump stable’s image, then scale canary to 0 or delete it.

# Replace stable with the new version and restore the original replicas
k set image deploy/web-stable web=nginx:1.27
k scale deploy web-stable --replicas=9
# Wait until stable has finished rolling out the new image
k rollout status deploy/web-stable
# Clean up canary
k delete deploy web-canary

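If a problem is found, deleting the canary stops the exposure right away. Since stable still runs the old version, the impact on users is minimal. If you scaled stable down while raising the canary share, scale it back up to its original replicas.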

Canary’s limits are clear too. Traffic splitting is only an approximation proportional to replica counts, and precise routing by header or by user is not possible. When you need precise control, you reach for an ingress controller’s canary features (for example, the ingress-nginx canary annotations) or a service mesh, but the CKAD scope stops at this replicas-based implementation.

Comparing the three strategies #

| Aspect | rolling update | blue-green | canary |
| --- | --- | --- | --- |
| Implementation | Deployment default strategy | Two Deployments + selector switch | Two Deployments + shared label |
| Extra resources | Almost none (as much as maxSurge) | Double (both versions kept up) | Slight (as much as canary replicas) |
| Rollback speed | Moderate (k rollout undo) | Fastest (revert selector) | Fast (delete canary) |
| Traffic control | None (gradual swap) | All or nothing | Approximated by replicas ratio |
| Version mixing | Mixed during swap | Never mixed | Intentionally coexist |
| Validation opportunity | Little | Plenty before cutover | Gradual with a small slice |

The three strategies aren’t a ranking — they’re a situational choice. Rolling update for a low-resource, routine swap; blue-green when fast cutover and instant rollback matter; canary when you want to validate risk gradually.

Exam points #

For CKAD, it’s no exaggeration to say this topic is all about the selector switch.

  • Blue-green cutover is a change to the Service’s selector. Get k set selector svc <name> 'app=web,version=green' or k patch svc into your fingers and it’s done in seconds.
  • Rollback is reverting the selector. Leaving the blue Deployment in place is the precondition for rollback.
  • Canary is about setting the Service selector to only the shared label and making the two Deployments share that label. If you put a distinguishing label like track into the selector, the split won’t happen.
  • Adjust the traffic ratio with k scale deploy ... --replicas=N. Just remember that 9:1 is about 10%.
  • Always verify with k get endpoints <svc> and k get pods --show-labels. The habit of visually confirming which Pods are bound behind the Service prevents wrong answers.
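
The verification habit from the last point can be bundled into one helper, sketched below. The show_binding name is an assumption, and the commands need kubectl and a cluster, so the function is defined but not called.

```shell
# show_binding SERVICE SELECTOR
# Hypothetical helper that prints the three views worth checking after any change.
show_binding() {
  # The selector the Service currently uses
  kubectl get svc "$1" -o jsonpath='{.spec.selector}'; echo
  # The Pod IPs currently bound behind the Service
  kubectl get endpoints "$1"
  # Pods carrying the given labels, with all of their labels shown
  kubectl get pods -l "$2" --show-labels
}

# Usage after a blue-green cutover:
#   show_binding web app=web,version=green
# Usage during a canary:
#   show_binding web app=web
```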

If label and selector behavior trips you up, it helps to revisit how a Service finds Pods in K8s Hands-on #5.

Wrap-up #

What this post locked in:

  • A deployment strategy is not a separate resource but a combination of Deployment + Service + label. CKAD asks for the ability to implement it by hand, with no managed tooling.
  • rolling update (default, gradual swap) and recreate (take all down, then bring up) are provided as a Deployment’s strategy.
  • blue-green brings up both versions at once and switches the Service selector in one move for an instant cutover. Rollback is the fastest, via reverting the selector.
  • canary has the Service select both stable and canary via a shared label and approximately splits traffic by the replicas ratio.
  • Validation of every strategy starts with confirming which Pods are bound, using k get endpoints and --show-labels.

Next — Helm #

So far we’ve built manifests by hand and manipulated them imperatively. In practice, though, you have to deploy the same app repeatedly, slightly different per environment. In #9 Helm: install, upgrade, rollback, values, we’ll cover Helm, which bundles manifests into templates, injects per-environment differences through values, and handles install, upgrade, and rollback at the package level.
