Certified Kubernetes Administrator (CKA) #6 Cluster Upgrade: kubeadm upgrade plan/apply, Per-node drain

If #5 HA Cluster raised availability with multiple control planes and external etcd, this post covers the standard procedure for upgrading that cluster by one minor version. It’s a staple task that almost always shows up in the CKA hands-on exam. The procedure itself is fixed, so once you have memorized the order and practiced it by hand, it’s an area where you can reliably bank points.

Upgrades are tricky not because the commands are hard, but because getting the order wrong by even one step can destabilize the cluster. Control plane first, then workers. One minor version at a time. And kubeadm first, kubelet and kubectl after. The goal of this post is to commit these three principles to muscle memory.

The three principles of upgrading #

Before memorizing commands, let’s lock in the principles. Break any of these three and no command will save you.

1) Control plane first, workers last. The control plane components (apiserver, controller-manager, scheduler) must be at the same version as, or newer than, the workers’ kubelet; a kubelet must never be newer than the apiserver it talks to. That’s why you always upgrade the control plane nodes first, and only then upgrade the worker nodes one by one.
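As a quick sanity check on this rule, here is a minimal sketch that compares only the minor numbers of two version strings. The function names minor_of and kubelet_not_newer are made up for this post, and the sketch assumes versions look like v1.32.0 with the same major version on both sides.

```shell
# minor_of prints the minor number of a version such as v1.32.0 or 1.32.0.
minor_of() {
  v="${1#v}"                  # drop an optional leading "v"
  v="${v#*.}"                 # drop the major part ("1.")
  printf '%s\n' "${v%%.*}"    # keep everything before the next dot
}

# kubelet_not_newer APISERVER_VERSION KUBELET_VERSION
# Succeeds when the kubelet minor is less than or equal to the apiserver minor.
kubelet_not_newer() {
  [ "$(minor_of "$2")" -le "$(minor_of "$1")" ]
}

# Example:
#   kubelet_not_newer v1.32.0 v1.31.4 && echo "skew OK"
```

Upgrading a worker before the control plane is exactly the case where this check would fail.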

2) One minor version at a time. Kubernetes does not support skipping a minor version, such as going from 1.30 straight to 1.32. You must step through each minor in turn: 1.30 → 1.31 → 1.32. Patch versions (e.g., 1.31.0 → 1.31.4) stay within the same minor, so you can jump across them in a single step.
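To make the "one minor at a time" rule concrete, the following sketch prints the hops you would have to make between two minor versions. upgrade_path is a hypothetical helper, not a kubeadm feature, and it assumes both arguments are given as MAJOR.MINOR with the same major.

```shell
# upgrade_path CURRENT TARGET, e.g. upgrade_path 1.30 1.32
# Prints one line per required upgrade, each moving up exactly one minor.
upgrade_path() {
  major="${1%%.*}"   # "1.30" -> "1"
  step="${1#*.}"     # "1.30" -> "30"
  to="${2#*.}"       # "1.32" -> "32"
  while [ "$step" -lt "$to" ]; do
    next=$((step + 1))
    printf '%s.%s -> %s.%s\n' "$major" "$step" "$major" "$next"
    step="$next"
  done
}

# Example:
#   upgrade_path 1.30 1.32
```

Each printed line corresponds to one full pass through the procedure in this post (control plane, then every worker).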

3) Within a single node, kubeadm → kubelet/kubectl. There is an order even within a single node. First install the new kubeadm package, upgrade the components with kubeadm upgrade, and only then upgrade the kubelet and kubectl packages and restart kubelet.

Principle       Detail
Node order      Control plane first → worker nodes
Version jump    One minor version at a time (no skipping)
Package order   kubeadm → (upgrade apply/node) → kubelet/kubectl

Upgrading the control plane #

Let’s start with the control plane node. As an example, say we’re going from 1.31 to 1.32. First, connect to the control plane node.

ssh controlplane

1) Check the upgrade plan #

kubeadm upgrade plan shows the current version and the versions you can move to. It prints a table of which component goes to which version, so be sure to check it before any real work.

kubeadm upgrade plan

2) Swap the kubeadm package #

First, point the package repository at the minor version you are moving to. The pkgs.k8s.io package repositories, which became the standard around Kubernetes 1.28, are split per minor version, so you have to change the version part of /etc/apt/sources.list.d/kubernetes.list to the new minor.

# Switch the repository to the new minor version (1.31 → 1.32)
sed -i 's/v1\.31/v1.32/' /etc/apt/sources.list.d/kubernetes.list

# Refresh the package list
apt update

# Check the exact installable version
apt-cache madison kubeadm

# Swap kubeadm to the target version
apt-mark unhold kubeadm
apt install -y kubeadm=1.32.0-1.1
apt-mark hold kubeadm

apt-mark hold is a lock that prevents a package from being upgraded automatically by accident. Get into the habit of releasing it with unhold right before the upgrade and locking it again with hold right after.
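If you would rather not read the exact package version off the apt-cache madison output by eye, a small filter can pick it out. This is only a sketch: latest_pkg_version is a name invented here, it assumes the usual "package | version | source" column layout, and it relies on sort -V from GNU coreutils. In the exam the task usually names the exact version, so treat this as a convenience, not a requirement.

```shell
# latest_pkg_version reads `apt-cache madison <pkg>` output on stdin
# and prints the highest version found in the second column.
latest_pkg_version() {
  awk -F'|' 'NF >= 2 { gsub(/[ \t]/, "", $2); print $2 }' | sort -V | tail -n 1
}

# Example (run as root on the node, after `apt update`):
#   apt-cache madison kubeadm | latest_pkg_version
```

Because the repository is already pinned to the new minor, every version listed belongs to that minor.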

3) Upgrade the control plane components with kubeadm #

Now use the new kubeadm to upgrade the control plane components themselves.

# Check the swapped kubeadm version
kubeadm version

# Run the control plane component upgrade
kubeadm upgrade apply v1.32.0

kubeadm upgrade apply replaces the static Pod manifests of apiserver, controller-manager, and scheduler with new-version images, and upgrades the local (stacked) etcd along with them when needed; an external etcd cluster is not managed by kubeadm and has to be upgraded separately. This step is the heart of the control plane upgrade. On an HA cluster with multiple control plane nodes, run kubeadm upgrade apply only on the first node and kubeadm upgrade node on the remaining control plane nodes.
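The apply-versus-node split can be written down as a tiny lookup. The role labels used here (first-control-plane, other-control-plane, worker) are invented for this sketch; kubeadm itself has no such argument.

```shell
# upgrade_command ROLE TARGET_VERSION
# Prints the kubeadm command that fits the node's role.
upgrade_command() {
  case "$1" in
    first-control-plane)
      printf 'kubeadm upgrade apply %s\n' "$2" ;;
    other-control-plane|worker)
      printf 'kubeadm upgrade node\n' ;;
    *)
      printf 'unknown role: %s\n' "$1" >&2
      return 1 ;;
  esac
}

# Example:
#   upgrade_command first-control-plane v1.32.0
#   upgrade_command worker v1.32.0
```

Only one node in the whole cluster ever gets apply; every other node, control plane or worker, gets node.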

4) Swap kubelet and kubectl #

Once the control plane components are upgraded, upgrade kubelet and kubectl on the same node as well. It’s safer to drain the control plane node first to clear its workloads (k drain controlplane --ignore-daemonsets; k is the usual shell alias for kubectl).

# Swap kubelet and kubectl
apt-mark unhold kubelet kubectl
apt install -y kubelet=1.32.0-1.1 kubectl=1.32.0-1.1
apt-mark hold kubelet kubectl

# Reload kubelet config, then restart
systemctl daemon-reload
systemctl restart kubelet

# Allow scheduling again
k uncordon controlplane

The new kubelet only starts running after systemctl daemon-reload followed by systemctl restart kubelet. If you skip the restart, the package is upgraded but the old kubelet process keeps running, so the node still reports the old version. Finally, return the drained node to service with k uncordon.
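One way to catch a forgotten restart is to compare the installed kubelet binary with the version the node is reporting to the API server. The sketch below assumes kubelet --version prints a line like "Kubernetes v1.32.0"; kubelet_restart_pending is a hypothetical helper name.

```shell
# kubelet_restart_pending "<output of kubelet --version>" "<version the node reports>"
# Succeeds when the installed binary and the reported version differ,
# which suggests the running kubelet is still the old one.
kubelet_restart_pending() {
  installed="${1##* }"      # "Kubernetes v1.32.0" -> "v1.32.0"
  [ "$installed" != "$2" ]
}

# Example (on the node, with kubectl configured):
#   kubelet_restart_pending "$(kubelet --version)" \
#     "$(kubectl get node controlplane -o jsonpath='{.status.nodeInfo.kubeletVersion}')" \
#     && echo "restart kubelet"
```

Right after a restart the node may take a few seconds to report the new version, so a single mismatch is a hint rather than proof.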

Upgrading the worker nodes #

Once the control plane is done, upgrade the worker nodes one by one. The principle is to drain, upgrade, and uncordon one node at a time. If you drain several workers at once, there may not be enough nodes left to take the workloads.

1) drain the node #

Move the Pods on the node you’re about to upgrade to other nodes, and block new Pods from coming in. This command is usually run on the control plane node or wherever kubectl is configured.

k drain node01 --ignore-daemonsets --delete-emptydir-data

Be sure to remember these two options.

  • --ignore-daemonsets: DaemonSet Pods are meant to run one per node, so drain can’t move them elsewhere. Without this option, drain reports an error when it finds DaemonSet-managed Pods and stops.
  • --delete-emptydir-data: If there’s a Pod using an emptyDir volume, drain is refused with a warning that its local data will be lost. With this option you explicitly accept that the data may be lost, and drain proceeds.
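To avoid forgetting either option, you could wrap the drain in a small function and look at what is running on the node first. This is a sketch under assumptions: pods_on_node and drain_node are names made up here, and the KUBECTL variable exists only so the command can be swapped out (for example with echo) for a dry run.

```shell
# Command to run; override with KUBECTL=echo to print instead of execute.
KUBECTL="${KUBECTL:-kubectl}"

# pods_on_node NODE lists every Pod currently scheduled on NODE.
pods_on_node() {
  "$KUBECTL" get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}

# drain_node NODE drains NODE with both options from the text above.
drain_node() {
  "$KUBECTL" drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Example:
#   pods_on_node node01
#   drain_node node01
```

After a successful drain, pods_on_node should show only DaemonSet-managed Pods and static Pods.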

2) Upgrade kubeadm and kubelet on the node #

Now connect to that worker node and upgrade the packages the same way as on the control plane.

ssh node01

# Switch the repository to the new minor version
sed -i 's/v1\.31/v1.32/' /etc/apt/sources.list.d/kubernetes.list
apt update

# Swap kubeadm
apt-mark unhold kubeadm
apt install -y kubeadm=1.32.0-1.1
apt-mark hold kubeadm

# Upgrade the worker node's kubelet config
kubeadm upgrade node

On a worker node you use kubeadm upgrade node, not kubeadm upgrade apply. This command doesn’t touch the control plane components; it only updates that node’s kubelet config to match the new version. Confusing apply with node is a common mistake.

# Swap kubelet and kubectl
apt-mark unhold kubelet kubectl
apt install -y kubelet=1.32.0-1.1 kubectl=1.32.0-1.1
apt-mark hold kubelet kubectl

# Restart kubelet
systemctl daemon-reload
systemctl restart kubelet

3) uncordon the node #

Once the upgrade is done, make the node schedulable again.

k uncordon node01

If you skip this command, the node is upgraded but stays in the SchedulingDisabled state and cannot take new Pods. Memorize the pairing: every drain must end with an uncordon. If there are several workers, repeat steps 1–3 identically for node02 and node03.
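The "one node at a time" loop can be sketched as a plan printer. worker_plan is a hypothetical helper that only prints the steps in order; it does not run anything, since the middle step has to be done on the node itself over ssh.

```shell
# worker_plan NODE... prints the per-node steps in the order they should run.
# Each node is fully finished (drain, upgrade, uncordon) before the next starts.
worker_plan() {
  for node in "$@"; do
    printf '%s: k drain %s --ignore-daemonsets --delete-emptydir-data\n' "$node" "$node"
    printf '%s: upgrade kubeadm, kubeadm upgrade node, upgrade kubelet/kubectl, restart kubelet\n' "$node"
    printf '%s: k uncordon %s\n' "$node" "$node"
  done
}

# Example:
#   worker_plan node01 node02 node03
```

The point of the ordering is that node02 is never drained until node01 has been uncordoned.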

The difference between cordon, drain, and uncordon #

These three commands are easy to mix up, so let’s sort them out in one table.

Command             New Pod scheduling   Existing Pods            Use
k cordon <node>     Blocked              Left as-is               Mark a node so that no new Pods are scheduled onto it
k drain <node>      Blocked              Evicted to other nodes   Empty a node before an upgrade/maintenance
k uncordon <node>   Allowed              No effect                Put a finished node back into use

The key is that drain includes cordon. Internally, drain cordons the node to block new Pods, then moves the existing Pods to other nodes. So you don’t need a separate cordon after drain, and once the work is done you just need to uncordon.
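Both cordon and drain leave the same mark on the node object: spec.unschedulable is set to true, and uncordon clears it. A minimal check, assuming kubectl is configured, might look like this. is_cordoned and the KUBECTL override are names introduced for this sketch.

```shell
# Command to run; can be pointed at a stub for testing.
KUBECTL="${KUBECTL:-kubectl}"

# is_cordoned NODE succeeds when the node is marked unschedulable.
# The field is absent on a schedulable node, so the output is empty then.
is_cordoned() {
  [ "$("$KUBECTL" get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}

# Example:
#   is_cordoned node01 && echo "node01 still needs: k uncordon node01"
```

This is the same information that k get nodes shows as SchedulingDisabled in the STATUS column.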

Verification #

The VERSION column of k get nodes confirms whether the upgrade went through.

k get nodes
NAME           STATUS   ROLES           AGE   VERSION
controlplane   Ready    control-plane   90d   v1.32.0
node01         Ready    <none>          90d   v1.32.0
node02         Ready    <none>          90d   v1.32.0

If every node’s STATUS is Ready and every VERSION shows the target version (v1.32.0), the upgrade succeeded. If a node still shows SchedulingDisabled, you skipped the uncordon; if its VERSION did not change, you most likely skipped the kubelet restart.
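With more than a handful of nodes, scanning the table by eye gets error-prone. The following sketch filters the k get nodes output and prints only the nodes that need attention. nodes_needing_attention is a hypothetical helper, and it assumes the default column layout shown above (STATUS second, VERSION last).

```shell
# nodes_needing_attention TARGET_VERSION
# Reads `kubectl get nodes` output on stdin and reports nodes whose STATUS
# is not exactly "Ready" or whose VERSION differs from TARGET_VERSION.
nodes_needing_attention() {
  awk -v want="$1" '
    NR == 1 || NF == 0 { next }                  # skip the header and blank lines
    $2 != "Ready" { print $1 ": status " $2 }    # e.g. Ready,SchedulingDisabled
    $NF != want   { print $1 ": version " $NF }  # kubelet not at the target yet
  '
}

# Example:
#   kubectl get nodes | nodes_needing_attention v1.32.0
```

Empty output means every node is Ready and at the target version.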

The version of the control plane components themselves can also be confirmed via the static Pods.

# Check the apiserver Pod's image tag
k get pod -n kube-system kube-apiserver-controlplane \
  -o jsonpath='{.spec.containers[0].image}'
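The jsonpath query returns the full image reference, so if you only want the version you can strip everything up to the last colon. image_tag is a name made up for this sketch, and it assumes the image is referenced by tag (not by digest), as in registry.k8s.io/kube-apiserver:v1.32.0.

```shell
# image_tag IMAGE prints the part after the last ":" of an image reference.
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Example:
#   image_tag "$(kubectl get pod -n kube-system kube-apiserver-controlplane \
#     -o jsonpath='{.spec.containers[0].image}')"
```

The same check works for the controller-manager and scheduler Pods by changing the Pod name.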

Traps people miss #

Here’s a collection of the patterns that cost you points in the exam.

  • Missing drain options. Drain without --ignore-daemonsets and it stalls on DaemonSet Pods. Drain without --delete-emptydir-data and it’s refused with a local-data warning. Make it a habit to always attach both options.
  • Attempting a version jump. Try to go from 1.30 to 1.32 in one go and kubeadm upgrade refuses. You must go through one minor at a time.
  • Confusing apply with node. The first control plane is kubeadm upgrade apply <ver>; the remaining control planes and the workers are kubeadm upgrade node.
  • Missing kubelet restart. Upgrade only the package and skip systemctl restart kubelet, and the VERSION won’t be updated.
  • Missing uncordon. Forget to uncordon a drained node and it won’t take new Pods.
  • Reversed order. Upgrade a worker before the control plane and its kubelet ends up newer than the control plane, which is an unsupported combination.

Exam points #

  • The upgrade order is control plane → workers, one minor version at a time, and within a node kubeadm → kubelet/kubectl.
  • The control plane goes kubeadm upgrade plan → swap packages → kubeadm upgrade apply <ver> → swap kubelet/kubectl → systemctl restart kubelet.
  • The worker goes k drain <node> --ignore-daemonsets --delete-emptydir-data → swap packages → kubeadm upgrade node → swap kubelet/kubectl → restart → k uncordon <node>.
  • On HA, additional control plane nodes are upgraded with kubeadm upgrade node, not apply.
  • Since drain includes cordon, once you’re done you just pair it with uncordon.
  • Verification is done with the STATUS and VERSION columns of k get nodes.

Wrap-up #

What this post locked in:

  • The three principles. Control plane first, one minor version at a time, kubeadm first within a node.
  • Control plane. Check with kubeadm upgrade plan, install the new kubeadm and upgrade the components with kubeadm upgrade apply <ver>, then upgrade kubelet/kubectl and restart kubelet.
  • Worker. Empty it with drain, upgrade the packages and run kubeadm upgrade node, then restart kubelet and return the node to service with uncordon.
  • cordon/drain/uncordon. Drain includes cordon; once done, finish off with uncordon.
  • Traps. The two drain options, no version jump, distinguishing apply from node, watch out for missing the restart and uncordon.

Next — etcd Backup and Restore #

With the upgrade, we’ve covered the cluster’s lifecycle. Next is the job of protecting etcd, which holds all of the cluster’s state.

In #7 etcd Backup and Restore, we’ll take a snapshot with etcdctl snapshot save, then assume a failure scenario and restore the cluster state with etcdctl snapshot restore. We’ll walk through it by hand, including specifying the certificate paths and the endpoint, and editing the static Pod manifest after the restore so it points at the new data directory.
