Certified Kubernetes Administrator (CKA) #27 Full-Length Practice Exam — 17 Tasks with Solutions
From #1 the exam environment through #26 exam tips, we have circled every domain once. The final post of this series is not one you read but one you solve. Just like the real CKA, it gathers 17 tasks that integrate every domain in one place. These are not multiple choice — they are hands-on scenarios where you build and fix a cluster directly in an empty terminal and on the nodes, and each task carries a point value.
The recommended time limit is 2 hours, the same as the real exam. The pass line is 66%, scored by summing the point values of all 17 tasks. If you get stuck on a task, mark it, move on, and bank points from the high-value tasks you have a feel for first — that is the way over the pass line.
Because CKA has multiple clusters, making context switching the very first thing you do prevents wrong answers. Each task is only graded if you solve it in the specified context, and since many tasks take you inside the nodes, you also need a feel for SSH, systemctl, and etcdctl. For each task, solve it fully on your own first, then unfold the solution. If you read the solution first, your hands never learn it.
How to take it #
- Solving on a multi-node cluster built with kubeadm is closest to the real thing. If that is hard locally, stand up one control plane and one worker on two or three cloud VMs. etcd backup/restore and node troubleshooting don’t build the right feel on a single-node minikube.
- For each task, switch to the specified context first. As this series has repeated, a misconfigured context scores 0 even if your answer is correct.
  k config use-context <the context the question specifies>
- Some tasks have you SSH into the nodes, so check ahead of time that you can connect using the hostnames the exam presents (such as node01). If you need root on a node, switch with sudo -i.
- Solve all 17 to the end, then unfold the solutions and grade them in one pass. Peeking at solutions mid-exam dulls your sense of the real thing. Applying the alias k=kubectl and export do="--dry-run=client -o yaml" setup from #1 first will save you time.
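The setup described above can be sketched as a few lines to paste at the start of a session. The alias and the do variable come from #1; use_ctx is a hypothetical helper name introduced here for illustration, not something the exam environment provides.

```shell
# Shortcuts from #1. The alias is expanded as you type in an interactive shell;
# inside functions and scripts, call kubectl by its full name.
alias k=kubectl
export do="--dry-run=client -o yaml"

# use_ctx (hypothetical helper): switch to the context a task names, then
# confirm the switch actually took effect before doing any work.
use_ctx() {
  kubectl config use-context "$1" >/dev/null || return 1
  [ "$(kubectl config current-context)" = "$1" ]
}
```

A task would then start with something like use_ctx cluster1 && k get nodes, so a failed switch stops you before you touch the wrong cluster.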
Domain distribution #
The 17 tasks loosely follow the domain weights of the real CKA. Troubleshooting is the largest domain on the real exam at 30% and gets 4 tasks here, second only to the 5 tasks of Cluster Architecture, Installation, Configuration.
| # | Domain | Tasks | Task numbers |
|---|---|---|---|
| 1 | Cluster Architecture, Installation, Configuration | 5 | 1, 2, 3, 4, 5 |
| 2 | Workloads and Scheduling | 3 | 6, 7, 8 |
| 3 | Services and Networking | 3 | 9, 10, 11 |
| 4 | Storage | 2 | 12, 13 |
| 5 | Troubleshooting | 4 | 14, 15, 16, 17 |
The points reflect the domain weights and task difficulty, and the per-task values add up to 118. The scoring criteria are laid out at the end of the post.
Task 1 (8 points): Cluster Architecture, Installation, Configuration #
In context cluster1, save a snapshot of the running etcd to /opt/etcd-backup.db. Work by SSHing into the control plane node, where etcd runs as a static Pod and the certificates live under /etc/kubernetes/pki/etcd.
Solution
Connect to the control plane node, switch to root, and save the snapshot.
ssh cluster1-controlplane
sudo -i
ETCDCTL_API=3 etcdctl snapshot save /opt/etcd-backup.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
Once it finishes, check the snapshot status.
ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/etcd-backup.db
Explanation: snapshot save connects to etcd with a client certificate, so all three of --cacert, --cert, and --key are required, and you can find the paths in the --cert-file, --key-file, and --trusted-ca-file values of the etcd static Pod manifest (/etc/kubernetes/manifests/etcd.yaml). On etcdctl older than 3.4, dropping ETCDCTL_API=3 makes the command run against the v2 API and fail; from 3.4 on, v3 is the default, so keeping the variable is harmless.
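Looking up the certificate paths in the manifest can be sketched as a single grep. The helper name etcd_cert_flags is hypothetical; the default path is the kubeadm manifest location mentioned above.

```shell
# etcd_cert_flags (hypothetical helper): print the client-facing certificate
# flags from an etcd static Pod manifest. The peer-* flags are deliberately
# not matched, because etcdctl needs the server cert/key and the CA.
etcd_cert_flags() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  grep -E -- '--(cert-file|key-file|trusted-ca-file)=' "$manifest"
}
```

The three values map onto the etcdctl flags as --cert-file to --cert, --key-file to --key, and --trusted-ca-file to --cacert.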
Task 2 (8 points): Cluster Architecture, Installation, Configuration #
Restore the snapshot /opt/etcd-backup.db from Task 1 into a new data directory /var/lib/etcd-restore, and edit the manifest so the etcd static Pod uses this directory. After the restore, confirm the cluster returns to normal.
Solution
Restore the snapshot into the new directory.
ETCDCTL_API=3 etcdctl snapshot restore /opt/etcd-backup.db \
--data-dir=/var/lib/etcd-restore
In the etcd static Pod manifest, change the host path to the new directory.
vim /etc/kubernetes/manifests/etcd.yaml
  volumes:
  - name: etcd-data
    hostPath:
      path: /var/lib/etcd-restore
      type: DirectoryOrCreate
Wait for kubelet to detect the manifest change and bring the etcd Pod back up, then verify.
crictl ps | grep etcd
k get nodes
Explanation: snapshot restore only creates a new data directory; it does not restart etcd. The actual switch happens at the step where you change hostPath.path in the static Pod manifest to the new directory, making kubelet recreate the etcd Pod against the new data. Watch out for the permissions on the parent path of --data-dir and for the brief moment the existing etcd container goes down.
Task 3 (8 points): Cluster Architecture, Installation, Configuration #
Upgrade the control plane and worker node of context cluster1 from v1.31.0 to v1.31.1. Upgrade the control plane first, then upgrade the worker node01. The worker must be emptied of workloads during its upgrade.
Solution
On the control plane node, bump kubeadm, then check and apply the upgrade plan.
ssh cluster1-controlplane
sudo -i
apt-get update && apt-get install -y kubeadm=1.31.1-1.1
kubeadm upgrade plan
kubeadm upgrade apply v1.31.1
Bump the control plane’s kubelet and kubectl, restarting kubelet between the drain and the uncordon.
kubectl drain cluster1-controlplane --ignore-daemonsets
apt-get install -y kubelet=1.31.1-1.1 kubectl=1.31.1-1.1
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon cluster1-controlplane
Drain the worker from the control plane, then upgrade it on the node itself.
kubectl drain node01 --ignore-daemonsets
ssh node01
sudo -i
apt-get update && apt-get install -y kubeadm=1.31.1-1.1
kubeadm upgrade node
apt-get install -y kubelet=1.31.1-1.1
systemctl daemon-reload && systemctl restart kubelet
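exit   # leave the root shell
exit   # leave node01 and return to the control plane
kubectl uncordon node01
Explanation: The key difference is using kubeadm upgrade apply for the control plane and kubeadm upgrade node for the worker. The order is install kubeadm, run the upgrade, install kubelet/kubectl, then restart kubelet, and draining without --ignore-daemonsets gets blocked by DaemonSet Pods. If the packages are pinned with apt-mark hold, unhold them before installing and hold them again afterwards. kubeadm cannot skip minor versions, so upgrade one minor version at a time.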
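The one-minor-at-a-time rule can be sketched as a small version check. minor_jump_ok is a hypothetical helper, not part of kubeadm, and it assumes versions of the form v1.31.1.

```shell
# minor_jump_ok (hypothetical helper): succeed only if moving from version $1
# to version $2 stays within the same major version and raises the minor
# version by at most one. Patch-level changes always pass.
minor_jump_ok() {
  from="${1#v}"; to="${2#v}"
  from_major="${from%%.*}"; to_major="${to%%.*}"
  from_rest="${from#*.}"; to_rest="${to#*.}"
  from_minor="${from_rest%%.*}"; to_minor="${to_rest%%.*}"
  [ "$from_major" -eq "$to_major" ] || return 1
  diff=$((to_minor - from_minor))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}
```

For example, minor_jump_ok v1.31.0 v1.31.1 succeeds, while minor_jump_ok v1.29.0 v1.31.1 fails because it would skip v1.30.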
Task 4 (8 points): Cluster Architecture, Installation, Configuration #
Create a ServiceAccount deployer that operates only in namespace dev, and configure RBAC so this ServiceAccount can create, get, update, and delete Deployments within dev. Verify the permissions are correct with kubectl auth can-i.
Solution
Create the ServiceAccount, Role, and RoleBinding.
k -n dev create serviceaccount deployer
k -n dev create role deploy-manager \
--verb=create,get,list,update,delete \
--resource=deployments.apps
k -n dev create rolebinding deployer-binding \
--role=deploy-manager \
--serviceaccount=dev:deployer
Verify the permissions.
k -n dev auth can-i create deployments --as=system:serviceaccount:dev:deployer
k -n dev auth can-i delete deployments --as=system:serviceaccount:dev:deployer
Explanation: A Role is a namespace-scoped set of permissions, and a RoleBinding connects that Role to a subject (here, the ServiceAccount). Spelling out the API group as in --resource=deployments.apps keeps the rule unambiguous; in a hand-written Role the equivalent is apiGroups: ["apps"], and a rule that lists deployments under the core group ("") matches nothing. When verifying, the subject must be written in the form system:serviceaccount:<ns>:<name>, and both commands should return yes.
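Checking all four verbs the task lists, rather than two, can be sketched as a loop. check_deployer is a hypothetical helper name; it only wraps the kubectl auth can-i call shown above.

```shell
# check_deployer (hypothetical helper): ask the API server whether the
# ServiceAccount may perform each verb the task lists, and stop at the first
# answer that is not "yes".
check_deployer() {
  for verb in create get update delete; do
    answer=$(kubectl -n dev auth can-i "$verb" deployments \
      --as=system:serviceaccount:dev:deployer)
    echo "$verb: $answer"
    [ "$answer" = "yes" ] || return 1
  done
}
```

Running check_deployer after creating the RoleBinding should print four lines ending in yes.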
Task 5 (6 points): Cluster Architecture, Installation, Configuration #
Check the expiry date of the certificate /etc/kubernetes/pki/apiserver.crt, and renew all cluster certificates with kubeadm. After renewal, check again that the expiry date has moved into the future.
Solution
On the control plane node, check the certificate expiry status.
ssh cluster1-controlplane
sudo -i
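kubeadm certs check-expiration
To see a specific certificate’s expiry date directly, check it with openssl.
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate
Renew all certificates, restart the control plane static Pods by moving their manifests out of the watched directory and back, then check the expiry again.
kubeadm certs renew all
mkdir -p /tmp/manifests && mv /etc/kubernetes/manifests/*.yaml /tmp/manifests/
sleep 20 && mv /tmp/manifests/*.yaml /etc/kubernetes/manifests/
kubeadm certs check-expiration
Explanation: kubeadm certs check-expiration shows the expiry dates of each certificate and kubeconfig in a table. certs renew all only reissues the certificate files, so the control plane static Pods (apiserver, controller-manager, scheduler, etcd) must be restarted to load them. Restarting kubelet alone does not recreate their containers; moving the manifests out of /etc/kubernetes/manifests for about 20 seconds and back does. kubeadm also renews certificates during a control plane upgrade, so a regular upgrade can double as a renewal.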
Task 6 (6 points): Workloads and Scheduling #
In namespace apps, create a DaemonSet log-agent that runs exactly one fluentd log collector on every node. Use the image fluent/fluentd:v1.16, and it does not need to run on the control plane node because of its taint.
Solution
Since this is a kind whose skeleton you can’t build with dry-run, write the manifest by hand.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: apps
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16
k apply -f log-agent.yaml
k -n apps get ds log-agent
Explanation: A DaemonSet places one Pod on each eligible node and has no replicas field. The control plane node usually has a node-role.kubernetes.io/control-plane:NoSchedule taint, so without a matching toleration nothing schedules there, which means the requirement that it “does not need to run there” is satisfied by simply not adding a toleration. As with a Deployment, the selector must match the template labels.
Task 7 (7 points): Workloads and Scheduling #
Add the taint gpu=true:NoSchedule to the worker node node01, and in namespace apps create a Pod ml-job (image nginx) that tolerates this taint and schedules only onto nodes with the label disktype=ssd.
Solution
Add the taint to the node and check that the label is present.
k taint node node01 gpu=true:NoSchedule
k label node node01 disktype=ssd
Create a Pod with a toleration and nodeAffinity.
apiVersion: v1
kind: Pod
metadata:
  name: ml-job
  namespace: apps
spec:
  tolerations:
  - key: gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: ml-job
    image: nginx
k apply -f ml-job.yaml
k -n apps get pod ml-job -o wide
Explanation: A taint is the mechanism by which a node repels Pods, and a toleration is a Pod’s declaration that it will tolerate that taint. With only a toleration the Pod can still land on other nodes, so to pin it to a specific node you must add nodeAffinity or nodeSelector as well. The toleration’s value must be quoted ("true"); unquoted, YAML parses true as a boolean and the API server rejects the manifest because the field expects a string.
Task 8 (7 points): Workloads and Scheduling #
In namespace data, create a StatefulSet cache. Use the image redis:7 with replicas 3, and give the Pods stable DNS names through a headless Service cache. Each Pod has a 1Gi PVC via volumeClaimTemplates.
Solution
Define the headless Service and the StatefulSet together.
apiVersion: v1
kind: Service
metadata:
  name: cache
  namespace: data
spec:
  clusterIP: None
  selector:
    app: cache
  ports:
  - port: 6379
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cache
  namespace: data
spec:
  serviceName: cache
  replicas: 3
  selector:
    matchLabels:
      app: cache
  template:
    metadata:
      labels:
        app: cache
    spec:
      containers:
      - name: redis
        image: redis:7
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
k apply -f cache.yaml
k -n data get pods -l app=cache
Explanation: A StatefulSet must be tied to a headless Service (clusterIP: None) via serviceName to grant stable DNS in the form cache-0.cache.data.svc.cluster.local. volumeClaimTemplates automatically creates a separate PVC per Pod, and even when a Pod is recreated it reattaches to the same PVC. Pods being named sequentially from 0 is another difference from a Deployment.
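The naming pattern can be sketched by generating the names for all replicas. sts_dns_names is a hypothetical helper, and it assumes the default cluster domain cluster.local.

```shell
# sts_dns_names (hypothetical helper): print the per-Pod DNS names a
# StatefulSet gets through its headless Service.
# Arguments: StatefulSet name, Service name, namespace, replica count.
sts_dns_names() {
  name="$1"; service="$2"; namespace="$3"; replicas="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "$name-$i.$service.$namespace.svc.cluster.local"
    i=$((i + 1))
  done
}
```

Calling sts_dns_names cache cache data 3 lists cache-0 through cache-2, which are the names you would resolve from a test Pod to confirm the headless Service works.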
Task 9 (6 points): Services and Networking #
In namespace net, create a Deployment web with the nginx image and replicas 2, and expose it through a NodePort Service web-np accessible from outside the cluster on the node’s port 30080. The Service port is 80 and the target port is 80.
Solution
k -n net create deploy web --image=nginx --replicas=2
k -n net expose deploy web --name=web-np --type=NodePort --port=80 --target-port=80
Pin the nodePort value to 30080.
k -n net patch svc web-np --type='json' \
-p='[{"op":"replace","path":"/spec/ports/0/nodePort","value":30080}]'
Explanation: When you create a NodePort with expose, nodePort is auto-assigned from the 30000-32767 range, so to pin it to a specific value you either patch it or write nodePort: 30080 directly into the dry-run YAML and apply. Verify with curl <node IP>:30080 to confirm a response.
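The second route, writing nodePort into the manifest up front, might look like the sketch below. The file name web-np.yaml is an arbitrary choice, and the selector app=web relies on the label that kubectl create deployment web puts on its Pods.

```shell
# Write the Service manifest with nodePort pinned from the start, instead of
# creating the Service first and patching it afterwards.
cat > web-np.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-np
  namespace: net
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
EOF
```

Apply it with k apply -f web-np.yaml and confirm the port with k -n net get svc web-np.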
Task 10 (6 points): Services and Networking #
In namespace net, create an Ingress web-ing. Route traffic arriving at host web.example.com path / to port 80 of the Service web-np from Task 9, using Prefix for pathType and nginx for ingressClassName.
Solution
k -n net create ingress web-ing \
--class=nginx \
--rule="web.example.com/*=web-np:80" $do > ing.yaml
The generated manifest looks like this.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ing
  namespace: net
spec:
  ingressClassName: nginx
  rules:
  - host: web.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-np
            port:
              number: 80
k apply -f ing.yaml
Explanation: The --rule of create ingress takes the form host/path=service:port, and the trailing /* of the path converts to pathType: Prefix. Drop --class=nginx and ingressClassName is left empty, so routing won’t work on a cluster with no default IngressClass. Actual operation requires the ingress-nginx controller, but grading looks at the correctness of the resource definition.
Task 11 (6 points): Services and Networking #
In namespace net, create a NetworkPolicy db-allow. Allow ingress traffic to Pods with the label app=db only when a Pod with the label role=api in the same namespace accesses TCP port 5432, and block all other ingress.
Solution
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow
  namespace: net
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: api
    ports:
    - protocol: TCP
      port: 5432
k apply -f db-allow.yaml
Explanation: podSelector selects the target Pods the policy applies to (app=db), and ingress.from selects the allowed sources (role=api). Once an Ingress policy selects a Pod, all ingress traffic not explicitly allowed is blocked, so other ingress is shut off without a separate deny-all. Putting from and ports in the same rule makes it an AND of both conditions, and it is enforced only on a policy-capable CNI such as Calico.
Task 12 (8 points): Storage #
Create a 2Gi PersistentVolume pv-data (accessMode ReadWriteOnce, reclaimPolicy Retain) backed by the host path /mnt/data. Then, in namespace storage, create a 1Gi PVC pvc-data that binds to exactly this PV, and finally create a Pod pv-user (nginx) that mounts this PVC at /usr/share/nginx/html.
Solution
Define the PV, PVC, and Pod in order.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-data
  namespace: storage
spec:
  # empty class: bind to the static PV instead of the default StorageClass
  storageClassName: ""
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pv-user
  namespace: storage
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-data
k apply -f pv.yaml
k -n storage get pvc pvc-data
Explanation: For a PVC to bind to a PV, the accessModes must match and the PV’s capacity must be greater than or equal to the PVC’s request (2Gi ≥ 1Gi). On a cluster with a default StorageClass, a PVC without storageClassName is dynamically provisioned instead, so storageClassName: "" keeps it on the static PV; adding volumeName: pv-data pins the claim to that exact volume. A PV is a cluster-wide resource so it has no namespace, while the PVC and Pod must live in the same namespace. With reclaimPolicy: Retain, the PV’s data survives even after you delete the PVC, and the PV goes into the Released state.
Task 13 (6 points): Storage #
Create a StorageClass fast for dynamic provisioning (a StorageClass is cluster-scoped, so it takes no namespace). The provisioner is rancher.io/local-path, volumeBindingMode is WaitForFirstConsumer, and allowVolumeExpansion is true. Then, in namespace storage, create a 3Gi PVC pvc-fast using this StorageClass.
Solution
Define the StorageClass and PVC.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fast
  namespace: storage
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 3Gi
k apply -f sc.yaml
k -n storage get pvc pvc-fast
Explanation: WaitForFirstConsumer defers volume creation until a Pod that uses the PVC is scheduled, so a PVC on its own staying in Pending is normal. allowVolumeExpansion: true is required to later grow the PVC’s resources.requests.storage for online expansion. If you don’t specify storageClassName on a PVC, the default StorageClass is used, provided the cluster has one.
Task 14 (7 points): Troubleshooting #
The worker node node01 in context cluster1 is in NotReady state. Diagnose the cause and bring the node back to Ready.
Solution
First check the node status and conditions.
k get nodes
k describe node node01
Connect to the node and look at the kubelet status and logs.
ssh node01
sudo -i
systemctl status kubelet
journalctl -u kubelet -e
If kubelet is stopped, restart it; if it’s a config error, fix the file the logs point to and then restart.
systemctl enable --now kubelet
systemctl restart kubelet
Explanation: The most common causes of NotReady are that the kubelet service died (systemctl restart kubelet), a kubeconfig/CA path error, or disk/memory pressure. The Conditions in describe node and journalctl -u kubelet pin down the cause, and if the kubelet unit is disabled at boot, enable --now turns on automatic start at boot too. The key is recognizing that this is not a control plane problem but something you must look at on the node itself.
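After the fix, the Ready condition can be read directly instead of scanning the describe output. node_ready is a hypothetical helper name; the jsonpath filter selects the condition whose type is Ready.

```shell
# node_ready (hypothetical helper): print the status of a node's Ready
# condition, which is True, False, or Unknown.
node_ready() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

Back on the control plane, node_ready node01 should print True once kubelet is healthy again; Unknown usually means kubelet has stopped reporting.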
Task 15 (7 points): Troubleshooting #
On the control plane of context cluster2, kubectl fails to respond with connection refused. The apiserver static Pod manifest has been edited incorrectly. Find the cause, fix it, and bring the apiserver back to normal.
Solution
Connect to the control plane node and look at the apiserver container status.
ssh cluster2-controlplane
sudo -i
crictl ps -a | grep kube-apiserver
crictl logs <apiserver-container-id>
Review the static Pod manifest and fix the wrong field (for example, a mistyped port, a wrong certificate path, or broken indentation).
vim /etc/kubernetes/manifests/kube-apiserver.yaml
Wait for kubelet to detect the manifest change and bring the apiserver Pod back up, then verify.
crictl ps | grep kube-apiserver
k get nodes
Explanation: kubectl only works once the apiserver is up, so in this state the key is to read the container logs directly with crictl instead of kubectl. If the manifest is so broken that no container is created at all, journalctl -u kubelet shows the parse error. kubelet automatically recreates a static Pod when its manifest file changes, so briefly moving the file out and back forces a restart. A typo in the --etcd-servers address or a certificate path is a common cause.
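The move-out-and-back restart can be sketched as a small function. bounce_manifest is a hypothetical helper, and the 20-second default wait is an assumption chosen to give kubelet time to notice the file is gone.

```shell
# bounce_manifest (hypothetical helper): move a static Pod manifest out of the
# watched directory, wait, and move it back so kubelet recreates the Pod.
# Arguments: manifest path, optional wait in seconds (default 20, an assumption).
bounce_manifest() {
  manifest="$1"; wait_seconds="${2:-20}"
  parked="/tmp/$(basename "$manifest").parked"
  mv "$manifest" "$parked" || return 1
  sleep "$wait_seconds"
  mv "$parked" "$manifest"
}
```

On the control plane node this would be called as bounce_manifest /etc/kubernetes/manifests/kube-apiserver.yaml, followed by crictl ps | grep kube-apiserver to watch the new container appear.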
Task 16 (7 points): Troubleshooting #
Requests sent to the Service frontend in namespace apps are not responding. The Pods are all Running, but the Service’s endpoints are empty. Diagnose the cause and fix it so traffic reaches the Pods.
Solution
Compare the Service, the endpoints, and the Pod labels.
k -n apps get svc frontend -o wide
k -n apps get endpoints frontend
k -n apps describe svc frontend
k -n apps get pods --show-labels
If the Service’s selector is out of step with the Pods’ labels, fix the selector to match the Pod labels (the patch here assumes the Pods carry app=frontend).
k -n apps patch svc frontend \
-p '{"spec":{"selector":{"app":"frontend"}}}'
k -n apps get endpoints frontend
Explanation: Empty endpoints are a sign that the Service’s selector matches no Pod label. Even when Pods are Running, if the selector is out of step they aren’t registered in the endpoints and traffic can’t reach them. After fixing the selector, verify that get endpoints fills in with Pod IPs, and check for port name/number mismatches at the same time.
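One way to test the selector directly is to feed it back into a Pod query. pods_behind_service is a hypothetical helper; it assumes the Service uses a plain key=value selector and builds the label query with a go-template.

```shell
# pods_behind_service (hypothetical helper): read a Service's selector, turn
# it into a key=value,key=value label query, and list the Pods it matches.
# No output means the selector matches nothing, so the endpoints stay empty.
pods_behind_service() {
  ns="$1"; svc="$2"
  selector=$(kubectl -n "$ns" get svc "$svc" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}') || return 1
  kubectl -n "$ns" get pods -l "${selector%,}" -o name
}
```

Running pods_behind_service apps frontend before and after the patch shows the list going from empty to the frontend Pods.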
Task 17 (7 points): Troubleshooting #
The Pod report in namespace apps is in CrashLoopBackOff state. Diagnose the cause, save the previous logs of the most recently terminated container to the file /tmp/report.log, and if the cause is out of memory (OOMKilled), raise the memory limit to 256Mi to bring it back to normal.
Solution
Diagnose the state and the termination cause.
k -n apps describe pod report
k -n apps logs report --previous > /tmp/report.log
If Last State is OOMKilled, raise the memory limit. A Pod’s resources can’t be edited in place, so pull the manifest, fix it, and recreate.
k -n apps get pod report -o yaml > report.yaml
    resources:
      limits:
        memory: "256Mi"
      requests:
        memory: "128Mi"
k -n apps delete pod report
k apply -f report.yaml
Explanation: CrashLoopBackOff is a state where the container repeatedly terminates, so the current log may be empty; use --previous (-p) to pull the logs of the container that terminated just before. If the Last State Reason in describe is OOMKilled, an insufficient memory limit is the cause. A running Pod’s resources are immutable, so you must edit the manifest, then delete and recreate the Pod, and if the Pod is managed by a Deployment, fix the Deployment’s template instead.
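The termination reason can also be read without scanning the describe output. last_termination_reason is a hypothetical helper, and it assumes the Pod has a single container (index 0).

```shell
# last_termination_reason (hypothetical helper): print why the first
# container of a Pod last terminated, for example OOMKilled or Error.
last_termination_reason() {
  kubectl -n "$1" get pod "$2" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}
```

If last_termination_reason apps report prints OOMKilled, the memory limit is the thing to raise; any other reason points back to the saved log in /tmp/report.log.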
Scoring criteria #
Grade by summing each task’s points. The per-task points add up to 118, so the 66% pass line works out to 78 points or higher.
| Domain | Task (points) | Subtotal |
|---|---|---|
| Cluster Architecture, Installation, Configuration | 1(8), 2(8), 3(8), 4(8), 5(6) | 38 |
| Workloads and Scheduling | 6(6), 7(7), 8(7) | 20 |
| Services and Networking | 9(6), 10(6), 11(6) | 18 |
| Storage | 12(8), 13(6) | 14 |
| Troubleshooting | 14(7), 15(7), 16(7), 17(7) | 28 |
| Total | 17 tasks | 118 |
Grading is result-based, just like the real exam. It looks not at how you typed the commands but at whether the resources you created and the cluster state you recovered match the requirements. Even within a single task, partial credit is split by item — labels, ports, paths, fields — so even when you’re stuck on one item, filling in the parts you can to the end is better for your score.
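The arithmetic behind the total and the pass line can be checked with a few lines of shell. The point values are the ones in the task headings; rounding the 66% line up to a whole point is an assumption about how to treat the fraction.

```shell
# Per-task point values in task order (1 to 17), as listed in each task heading.
points="8 8 8 8 6 6 7 7 6 6 6 8 6 7 7 7 7"

total=0
for p in $points; do
  total=$((total + p))
done

# 66% of the total, rounded up to the next whole point.
pass_line=$(( (total * 66 + 99) / 100 ))

echo "total=$total pass_line=$pass_line"   # prints: total=118 pass_line=78
```

To grade a run, replace each value with the points you actually earned and compare the new total against pass_line.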
Reviewing weak domains #
After grading, go back to the corresponding post in the table below for any low-scoring domain and review it.
| Domain | Related tasks | Posts to review |
|---|---|---|
| Cluster Architecture, Installation, Configuration | 1, 2, 3, 4, 5 | #6, #7, #8, #9 |
| Workloads and Scheduling | 6, 7, 8 | #11, #13, #14 |
| Services and Networking | 9, 10, 11 | #18, #19, #20 |
| Storage | 12, 13 | #16, #17 |
| Troubleshooting | 14, 15, 16, 17 | #22, #23, #24, #25 |
If you ran short on time on a particular task, the issue may be hand speed rather than domain knowledge. In that case, re-read #1 setup and #26 time management, and solve the same 17 tasks once more against the clock. Once etcd backup/restore and node troubleshooting are in your hands, the time per task drops noticeably.
Closing the series #
Starting from the exam environment setup in #1, we passed through every CKA domain across 27 posts: control plane, nodes, kubeadm install, HA, upgrades, etcd backup/restore, certificates, RBAC, workloads, scheduling, resources, storage, Service, Ingress, NetworkPolicy, and the four flavors of troubleshooting. If you cleared the 66% line on this mock (78 of the 118 points), you have built hands that can clear the pass line in the real exam room too. Congratulations.
If CKA was the hands-on exam for the cluster operator, the next step is CKS (Certified Kubernetes Security Specialist), which is about keeping that cluster secure. The feel for etcd, certificates, RBAC, and troubleshooting you built here is a direct stepping stone into it, so carry the momentum of passing straight into the next hands-on exam.