Certified Kubernetes Administrator (CKA) #4 Installing a Cluster with kubeadm: Bootstrapping a Single Control Plane
In #3 The Node and Pod Networking Model we worked through how kubelet, kube-proxy, and the CRI mesh together on a node. Now it’s time to actually install those components and turn a bare Linux machine into a Kubernetes cluster. A managed cloud cluster hides the control plane from you, but CKA tests your ability to stand that control plane up yourself. In this post we’ll bootstrap a single-control-plane cluster from scratch with kubeadm.
kubeadm is the official bootstrapping tool for standing up a cluster. It brings up the apiserver, etcd, scheduler, and controller-manager as static Pods, issues certificates, and creates the token a worker uses to join. The tasks that show up most often on the exam are exactly these two, kubeadm init and kubeadm join, so we’ll practice every command until it’s second nature.
The big picture: what kubeadm does #
Standing up a cluster with kubeadm breaks down per machine into the following.
- Common prerequisites on every node. Disable swap, load kernel modules, set sysctl, install the container runtime, install kubeadm/kubelet/kubectl
- Bootstrap on the control plane node. kubeadm init brings up the apiserver, etcd, scheduler, and controller-manager as static Pods
- Install a CNI. A Pod network plugin must be installed before the node goes Ready and Pods can talk to each other
- Join the worker nodes. Each worker joins the control plane with kubeadm join
The control plane and the workers are prepared identically through step 1. Roles diverge from step 2 onward. Let’s follow each one command by command.
1. Prerequisites (common to every node) #
Whether it’s the control plane or a worker, you do the same preparation. If even one step is missing, kubeadm init or kubelet fails to start, so we’ll go through them in order.
Disable swap #
By default, kubelet refuses to start if swap is on. This is to make behavior under memory pressure predictable. First turn it off for the current session, then comment out the swap line in /etc/fstab so it stays off after a reboot.
# turn swap off immediately
sudo swapoff -a
# keep it off after reboot (comment out the swap entry in fstab)
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# verify (output should be empty)
swapon --show
free -h
Kernel modules: overlay and br_netfilter #
You need overlay for the container runtime’s overlay filesystem, and br_netfilter to make bridged traffic pass through iptables. Write them into a config file so they load automatically after a reboot, and load them into the current session right now.
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# verify the modules loaded
lsmod | grep -E 'overlay|br_netfilter'
sysctl: bridged traffic and IP forwarding #
With br_netfilter loaded, make traffic crossing a bridge go through iptables rules and turn on IP forwarding on the node. These are the preconditions for the CNI and kube-proxy to work correctly.
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# apply without a reboot
sudo sysctl --system
# verify (all three should be 1)
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
Container runtime: containerd #
As we saw in #3, Kubernetes talks to the runtime through the CRI. Install containerd, the most widely used runtime, and configure it to use the same cgroup driver (systemd) as kubelet. A cgroup driver mismatch between the two is one of the most common causes of failure for beginners.
# install containerd (distro package or official binary)
sudo apt-get update && sudo apt-get install -y containerd
# generate the default config
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml >/dev/null
# use the systemd cgroup driver (match kubelet)
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# restart and enable on boot
sudo systemctl restart containerd
sudo systemctl enable containerd
Install kubeadm, kubelet, kubectl #
Install the three packages from the official repository. Holding their versions prevents an unintended automatic upgrade from shaking the cluster. Upgrades are covered deliberately in #6.
# packages needed to use the repository
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
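# make sure the keyring directory exists (older releases don't create it by default)
sudo mkdir -p -m 755 /etc/apt/keyrings
# add the Kubernetes apt repository key and source (match the version to your exam date)
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.31/deb/Release.key \
| sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg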
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /' \
| sudo tee /etc/apt/sources.list.d/kubernetes.list
# install, then hold the versions
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# enable kubelet on boot (before init, a crashloop state is normal)
sudo systemctl enable --now kubelet
This is where the common-to-every-node part ends. Worker nodes stop here, and the control plane node moves on to the next step.
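Before moving on, it can help to confirm the prerequisites actually took effect. The following is a minimal sketch, not part of kubeadm: the function names are hypothetical, and each helper takes an optional path argument (an assumption made here so the checks can be tried against sample files instead of the live system).

```shell
# Hypothetical preflight helpers; the function names are made up for this sketch.
# Each takes an optional path so it can be tried against sample files.

# Succeeds when no swap device is listed (the first line of the file is a header).
swap_is_off() {
  awk 'NR > 1 { active = 1 } END { exit (active ? 1 : 0) }' "${1:-/proc/swaps}"
}

# Succeeds when fstab has no uncommented swap entry.
fstab_has_no_swap() {
  ! grep -Eq '^[^#]*[[:space:]]swap[[:space:]]' "${1:-/etc/fstab}"
}

# Succeeds when the module appears in the loaded-module list.
# A module built into the kernel is not listed there, so a miss is a hint, not proof.
module_loaded() {
  grep -q "^$1 " "${2:-/proc/modules}"
}

# Succeeds when the sysctl key reads 1, for example net.ipv4.ip_forward.
sysctl_is_one() {
  local path
  path="${2:-/proc/sys}/$(printf '%s' "$1" | tr . /)"
  [ "$(cat "$path" 2>/dev/null)" = "1" ]
}
```

On a real node the helpers could be chained, for example swap_is_off && fstab_has_no_swap && module_loaded overlay && module_loaded br_netfilter && sysctl_is_one net.ipv4.ip_forward, and any step that fails points back to the matching subsection above.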
2. Bootstrapping the control plane: kubeadm init #
On one control plane node, run kubeadm init. This single command issues certificates, brings up the control plane components as static Pods, starts etcd, and creates the token a worker will use to join.
sudo kubeadm init \
--pod-network-cidr=192.168.0.0/16 \
--apiserver-advertise-address=10.0.0.10
Let’s pin down what the two options mean.
- --pod-network-cidr. The IP range to allocate to Pods. It must match the range the CNI you install later expects. Calico’s default is 192.168.0.0/16, Flannel’s default is 10.244.0.0/16.
- --apiserver-advertise-address. The node IP the apiserver advertises. If the node has multiple interfaces, specifying it is the safer choice.
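Because the CIDR has to match the CNI, one way to avoid typing the wrong range under exam pressure is to derive it from the CNI name. This is only an illustrative sketch: default_pod_cidr is a hypothetical helper, and the two ranges are the defaults quoted above.

```shell
# Hypothetical helper: print the pod CIDR that the CNI's default manifest expects.
default_pod_cidr() {
  case "$1" in
    calico)  echo "192.168.0.0/16" ;;
    flannel) echo "10.244.0.0/16" ;;
    *)       echo "unknown CNI: $1" >&2; return 1 ;;
  esac
}
```

It could then feed the init command directly, as in sudo kubeadm init --pod-network-cidr="$(default_pod_cidr flannel)" --apiserver-advertise-address=10.0.0.10.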
When kubeadm init succeeds, it prints two things at the end. You need to copy both exactly as they are.
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.0.10:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234...cdef
Configure kubeconfig #
You must run the first block of that output exactly as shown so kubectl can connect to the cluster. It copies admin.conf to a regular user’s ~/.kube/config.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# verify the connection (one control plane node line should appear)
kubectl get nodes
At this point, kubectl get nodes shows one control plane node, but its status is NotReady. That’s because you haven’t installed a CNI yet, and it’s normal.
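To pick out the nodes that still need attention without reading the table by eye, the output can be filtered. A minimal sketch, assuming the default column layout of kubectl get nodes (NAME first, STATUS second); not_ready_nodes is a hypothetical name.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin and skips the header line.
# A status such as "Ready,SchedulingDisabled" is reported too, on purpose.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Usage would be kubectl get nodes | not_ready_nodes. Right after kubeadm init this should print the control plane node, and it should print nothing once the CNI is in place.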
3. Installing a CNI: bringing the node to Ready #
Kubernetes does not ship a Pod network plugin (CNI) itself. You have to install one separately before the node goes Ready and Pods can talk to each other. Until the CNI is installed, even the CoreDNS Pods stay Pending.
Calico install example #
# apply the Calico manifest
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml
# watch the CNI Pods become Running
kubectl get pods -n kube-system -w
Flannel install example #
If you use Flannel, you need to have run kubeadm init with --pod-network-cidr=10.244.0.0/16. If the manifest and the CIDR don’t line up, Pods won’t get IPs.
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Once all the CNI Pods are Running, the node status flips from NotReady to Ready.
# it switches to Ready after a moment
kubectl get nodes
4. Joining worker nodes: kubeadm join #
On each worker node, run the kubeadm join command from the kubeadm init output as root. The worker, too, must have all of the step 1 prerequisites done.
sudo kubeadm join 10.0.0.10:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234...cdef
The three pieces of the command each build a trust relationship.
- 10.0.0.10:6443. The address and port of the apiserver the worker connects to.
- --token. The bootstrap token. The worker uses it to prove itself. Mind the expiry: the default lifetime is 24 hours.
- --discovery-token-ca-cert-hash. The hash of the control plane’s CA certificate. The worker uses it to verify that the apiserver it connected to really is our cluster.
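A join command that was copied by hand is easy to truncate. The sketch below checks the shape of the two secrets before they are pasted on a worker. The helper names are hypothetical, and the patterns assume the formats kubeadm prints: a token of six characters, a dot, and sixteen characters, and a hash of sha256: followed by 64 lowercase hex digits.

```shell
# Succeeds when the argument looks like a bootstrap token: <6 chars>.<16 chars>,
# lowercase letters and digits only.
valid_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Succeeds when the argument looks like a discovery hash as kubeadm prints it:
# "sha256:" followed by 64 lowercase hex digits.
valid_ca_cert_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}
```

These checks only catch malformed values. A well-formed but expired token still fails at join time, which is what the next section addresses.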
When you’ve lost the token or it has expired #
If you didn’t capture the kubeadm init output or 24 hours have passed, issue a fresh join command on the control plane node. This single line prints a complete join command carrying both the token and the CA hash.
# on the control plane node: reissue the complete join command
kubeadm token create --print-join-command
You can also create just a token, or check the currently live tokens.
# create only a new token
kubeadm token create
# list currently valid tokens
kubeadm token list
5. Verification #
Once the join is done, check the whole cluster’s state on the control plane node. In CKA you must always go through this verification after a task so you don’t lose points.
# whether all nodes are Ready
k get nodes
# whether the control plane components, CNI, and CoreDNS are Running
k get pods -n kube-system
If all is well, the control plane and the workers all show Ready, and the apiserver, etcd, scheduler, controller-manager, kube-proxy, CoreDNS, and CNI Pods in the kube-system namespace all show Running.
NAME           STATUS   ROLES           AGE   VERSION
controlplane   Ready    control-plane   8m    v1.31.0
node01         Ready    <none>          3m    v1.31.0
To check the state of kubelet itself inside a node, use systemctl and journalctl. These are the commands we set up in #1.
systemctl status kubelet
journalctl -u kubelet -f
Common pitfalls #
The mistakes that cost you points during install work fall into a handful of predictable patterns.
- Node NotReady because no CNI was installed. A node being NotReady right after kubeadm init is normal, but without a CNI it stays NotReady forever. CoreDNS also stays Pending. If you’ve done the whole install and the node won’t go Ready, the first thing to check is the CNI Pods with kubectl get pods -n kube-system.
- kubelet won’t come up because swap is on. kubeadm init stops at the preflight check with a swap error. This happens when you skipped swapoff -a, or left the swap entry in /etc/fstab and rebooted.
- Token expired. Since the bootstrap token’s lifetime is 24 hours, trying to attach a worker after that time fails authentication. Reissue with kubeadm token create --print-join-command.
- cgroup driver mismatch. If containerd uses cgroupfs while kubelet uses systemd, kubelet can’t bring up containers. Check containerd’s SystemdCgroup = true setting.
- pod-network-cidr and CNI mismatch. If the CIDR given to kubeadm init and the range the CNI manifest expects don’t line up, Pods don’t get IPs. Calico defaults to 192.168.0.0/16, Flannel to 10.244.0.0/16.
- CRI socket not specified. If multiple runtimes are installed, you have to specify it, as in kubeadm init --cri-socket unix:///run/containerd/containerd.sock.
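The cgroup driver pitfall in particular can be checked from the two config files. A minimal sketch: the helper names are hypothetical, and the kubelet path /var/lib/kubelet/config.yaml with its cgroupDriver field is an assumption about where kubeadm writes kubelet’s configuration, so adjust it if your node differs.

```shell
# Succeeds when containerd's config enables the systemd cgroup driver.
containerd_uses_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' \
    "${1:-/etc/containerd/config.toml}"
}

# Print the cgroup driver named in kubelet's config file.
# The default path is an assumption about a kubeadm-managed node.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" { print $2 }' "${1:-/var/lib/kubelet/config.yaml}"
}
```

If containerd_uses_systemd_cgroup fails while kubelet_cgroup_driver prints systemd, the two sides disagree. Fix the containerd setting and restart containerd.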
To undo a botched init #
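When you’ve misconfigured something and want to start over from scratch, kubeadm reset reverts what kubeadm init or kubeadm join changed on the node. It leaves the CNI configuration and your kubeconfig behind, so remove those by hand afterward.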
sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d $HOME/.kube/config
Exam points #
- kubeadm init brings up the control plane as static Pods. The apiserver, etcd, scheduler, and controller-manager manifests live in /etc/kubernetes/manifests/.
- Capture both blocks of the output right after kubeadm init (the kubeconfig setup and the join command). If you missed the join command, reissue it with kubeadm token create --print-join-command.
- If a node is NotReady, the first thing to suspect is whether the CNI is installed. Without a CNI, a node never goes Ready.
- If any of the four prerequisites (swap off, kernel modules, sysctl, container runtime) is missing, init or kubelet fails to start.
- Match --pod-network-cidr to the default range of the CNI you’ll install.
- Track kubelet state with systemctl status kubelet and journalctl -u kubelet.
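The first point is easy to verify on the control plane node: all four static Pod manifests should exist. The sketch below lists whichever are missing. missing_static_manifests is a hypothetical helper, and the file names assume kubeadm’s usual naming (kube-apiserver.yaml, etcd.yaml, kube-scheduler.yaml, kube-controller-manager.yaml).

```shell
# Print the name of each expected static Pod manifest that is absent.
# Prints nothing when all four are present.
missing_static_manifests() {
  local dir="${1:-/etc/kubernetes/manifests}"
  local name
  for name in kube-apiserver etcd kube-scheduler kube-controller-manager; do
    [ -f "$dir/$name.yaml" ] || echo "$name.yaml"
  done
}
```

An empty result after kubeadm init suggests the manifests were written. Any file it lists is a component kubelet has no manifest to start.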
Wrap-up #
What this post locked in:
- Prerequisites (every node). Disable swap, kernel modules (overlay, br_netfilter), sysctl (ip_forward, bridge-nf), install containerd with systemd cgroup, install kubeadm/kubelet/kubectl
- Bootstrap the control plane. kubeadm init --pod-network-cidr=..., secure the kubeconfig setup (mkdir -p ~/.kube; cp ...) and the join command from the output
- Install a CNI. You have to apply the Calico or Flannel manifest for the node to go Ready
- Join workers. kubeadm join (token + discovery hash), and on expiry reissue with kubeadm token create --print-join-command
- Verification and pitfalls. Check with k get nodes and k get pods -n kube-system, and inspect NotReady, swap, token expiry, and cgroup mismatch
Next: HA clusters #
We’ve stood up a single-control-plane cluster. But if there’s only one control plane node, the moment that one node dies, cluster management stops.
In #5 HA Clusters: Multiple control planes, external etcd cluster, we’ll cover the structure that secures availability by scaling the control plane to several nodes. We’ll work through how to put a load balancer in front of the apiserver, how to give kubeadm init a --control-plane-endpoint so additional control planes can join, and the difference between a stacked setup that keeps etcd alongside the control plane and an external etcd setup that separates it into its own cluster.