Certified Kubernetes Application Developer (CKAD) #6 Workloads 2: DaemonSet, StatefulSet
In #5 Workloads 1: Deployment, ReplicaSet, rolling update and rollback, we ran a bundle of stateless Pods with a Deployment and worked through rolling updates and rollbacks. But Deployment is not the only workload controller in Kubernetes. Work that has to place exactly one identical Pod on every node, and work where each Pod needs its own identity and dedicated storage, cannot be solved with a Deployment.
This post covers the two controllers responsible for those cases — DaemonSet and StatefulSet — from a hands-on angle. Since CKAD is an exam where you build things directly in a bare terminal, we will go over the concept of each resource and then immediately get our hands on them with YAML and kubectl.
Two things Deployment cannot solve #
Deployment assumes a set of interchangeable, stateless Pods. Set replicas to 3 and the scheduler decides on its own which node gets which Pod; if a Pod dies, it is simply replaced by an identical new one. Their names even get a random hash, like web-7d8f....
But in practice there is work that does not fit this assumption.
- A Pod that has to run exactly once per node. A log collector or a node monitoring agent needs the Pod count to match the node count. This is where you use a DaemonSet.
- Pods that each keep a unique name and a dedicated disk. Each member of a database cluster owns its own data and must come back with the same identity after a restart. This is where you use a StatefulSet.
These two controllers are the topic of this post.
DaemonSet: one Pod on every node #
A DaemonSet is a controller that places exactly one Pod on all (or some) nodes of a cluster. When a new node is added, a Pod automatically comes up on it too; when a node leaves, its Pod disappears along with it. There is no concept of replicas. The node count is the Pod count.
Where it is used #
DaemonSets are mainly used for system components that need to operate on a per-node basis.
- Log collectors. Agents like Fluentd or Fluent Bit that scrape each node’s logs and ship them to a central place
- Node monitoring. Agents like node-exporter that collect a node’s CPU, memory, and disk metrics
- CNI and storage plugins. Components that install networking or storage capabilities on each node
Limiting to some nodes with nodeSelector and tolerations #
By default, a DaemonSet places a Pod on every node its Pods are allowed to schedule onto. In a typical cluster that means every worker node, because control plane nodes carry a NoSchedule taint. If you want to place it only on specific nodes, specify a node label with nodeSelector.

    spec:
      template:
        spec:
          nodeSelector:
            disktype: ssd

Conversely, if you need to place a Pod on tainted nodes such as control plane nodes, allow that taint with tolerations. Monitoring agents have to watch control plane nodes too, so this setting is used often.

    spec:
      template:
        spec:
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule

Update strategy #
A DaemonSet’s updateStrategy comes in two forms.
- RollingUpdate (default). maxUnavailable caps how many Pods are replaced at once, replacing them sequentially.
- OnDelete. It does not replace Pods automatically; a new version comes up only when the user deletes the existing Pod manually.
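As a rough sketch of where these settings live, the fragment below writes the default strategy out explicitly. The value 1 for maxUnavailable is what the API falls back to when the field is omitted; the rest of the DaemonSet spec is left out here.

```yaml
# Fragment of a DaemonSet spec; selector and template are omitted for brevity.
spec:
  updateStrategy:
    type: RollingUpdate      # the default; OnDelete replaces a Pod only after you delete it yourself
    rollingUpdate:
      maxUnavailable: 1      # default value: at most one node's Pod is down at a time
```

With OnDelete, the rollingUpdate block is not used, so only the type line is needed.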
DaemonSet YAML example #
There is no kubectl create generator for DaemonSet. On the exam, the fastest approach is to generate a Deployment skeleton with a generator, then change kind to DaemonSet and remove the replicas, strategy, and status lines.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: log-agent
      namespace: logging
    spec:
      selector:
        matchLabels:
          app: log-agent
      template:
        metadata:
          labels:
            app: log-agent
        spec:
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          containers:
          - name: fluent-bit
            image: fluent/fluent-bit:2.2
            resources:
              limits:
                memory: 200Mi
              requests:
                cpu: 100m
                memory: 200Mi

Creation and verification look like this.

    k create namespace logging          # skip if the namespace already exists
    k apply -f ds.yaml
    k get daemonset -n logging
    k get pods -n logging -o wide       # check that one is up on each node

If the DESIRED, CURRENT, and READY values all match the number of nodes the DaemonSet targets, it is healthy.
StatefulSet: Pods with identity and storage #
A StatefulSet is a controller that gives each Pod a stable, unique identity. Unlike Deployment, which attaches random hash names, a StatefulSet gives Pods ordered, fixed names like web-0, web-1, web-2.
The three things a StatefulSet guarantees #
- Stable network ID. Each Pod has a fixed name like web-0 and a fixed DNS name in the form web-0.web.default.svc.cluster.local. The name and DNS stay the same even after the Pod restarts.
- Ordered creation and deletion. Pods are created one at a time in the order web-0 → web-1 → web-2, and deletion and scale-down proceed in reverse order. The next Pod comes up only after the previous one is Running and Ready.
- Stable storage. Each Pod has its own dedicated PVC created from volumeClaimTemplates. Even when the Pod is rescheduled, it reconnects to the same PVC and the data is preserved.
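The ordering guarantee above corresponds to the default value of the StatefulSet's podManagementPolicy field. The fragment below only makes that default visible; treat it as a sketch, since the field can simply be left out to get the same behavior.

```yaml
# Fragment of a StatefulSet spec; serviceName, selector, and template are omitted.
spec:
  replicas: 3
  podManagementPolicy: OrderedReady   # default: web-0, then web-1, then web-2, each waiting for the previous Pod
  # podManagementPolicy: Parallel     # alternative: create and delete Pods without waiting on each other
```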
Why a headless Service is needed #
A StatefulSet must be defined together with a headless Service. A headless Service is a Service configured with clusterIP: None; it does not allocate a cluster IP and instead creates an individual DNS record for each Pod. Put this Service’s name in the StatefulSet’s serviceName, and you can reach each Pod directly at addresses like web-0.web and web-1.web. When you need to connect directly to a specific member of a database cluster (for example, the primary), this stable address is essential.
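One way to see the per-Pod DNS record for yourself is a throwaway Pod that looks up the name. The manifest below is a minimal sketch: the Pod name dns-check and the busybox image are assumptions made for illustration, and it presumes a headless Service and StatefulSet both named web already exist in the default namespace, as in the example later in this post.

```yaml
# Hypothetical one-off Pod that resolves the per-Pod DNS name of web-0.
apiVersion: v1
kind: Pod
metadata:
  name: dns-check              # illustrative name, not part of the original example
  namespace: default
spec:
  restartPolicy: Never         # run the lookup once and stop
  containers:
  - name: lookup
    image: busybox:1.36        # assumed image; any image that ships nslookup would do
    command: ["nslookup", "web-0.web.default.svc.cluster.local"]
```

After applying it, k logs dns-check shows what the lookup returned, and k delete pod dns-check cleans it up.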
Per-Pod PVC with volumeClaimTemplates #
volumeClaimTemplates is the mold a StatefulSet uses to automatically stamp out a PVC for each Pod. With replicas set to 3 and a template named data, the PVCs data-web-0, data-web-1, and data-web-2 are created, following the pattern <template name>-<StatefulSet name>-<ordinal>. One caveat: by default, deleting the StatefulSet does not automatically delete these PVCs. Since the point is to preserve data, this is intended behavior; to clean up, you have to delete the PVCs yourself.
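To make the "mold" idea concrete, here is roughly what the PVC stamped out for web-0 would look like, assuming a StatefulSet named web in the default namespace with a template named data that requests 1Gi. It is an illustrative sketch; fields the cluster fills in on its own (status, bound volume, storage class defaults) are left out.

```yaml
# Approximate shape of the PVC created for Pod web-0 (server-populated fields omitted).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-web-0             # <template name>-<StatefulSet name>-<ordinal>
  namespace: default
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi             # copied from the template's storage request
```

You never write this object yourself; the StatefulSet controller creates one per Pod from the template.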
When to use it #
A StatefulSet is used for workloads where each instance has a unique identity and data.
- Databases. Cases where data differs per node, like a replicated PostgreSQL or MySQL setup
- Distributed systems. Clusters where roles and ordering among members matter, like Kafka, ZooKeeper, or Elasticsearch
StatefulSet YAML example (headless Service + volumeClaimTemplates) #
StatefulSet has no generator either, so you write it by hand. Keeping the headless Service and the StatefulSet together in one file makes management easier.

    apiVersion: v1
    kind: Service
    metadata:
      name: web                  # must match serviceName
      namespace: default
    spec:
      clusterIP: None            # headless: no cluster IP allocated
      selector:
        app: web
      ports:
      - port: 80
        name: http
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
      namespace: default
    spec:
      serviceName: web           # name of the headless Service above
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: nginx
            image: nginx:1.25
            ports:
            - containerPort: 80
              name: http
            volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi

After creating it, verify that the Pod names and PVCs were created in order.

    k apply -f sts.yaml
    k get statefulset web
    k get pods -l app=web        # created in order web-0, web-1, web-2
    k get pvc                    # data-web-0, data-web-1, data-web-2

You scale with the same command as a Deployment, but Pods are torn down in reverse order.

    k scale statefulset web --replicas=5   # adds web-3, web-4
    k scale statefulset web --replicas=2   # deletes web-4, web-3, web-2 in reverse

Deployment vs StatefulSet vs DaemonSet comparison #
The differences among the three controllers can be summarized in a single table.
| Item | Deployment | StatefulSet | DaemonSet |
|---|---|---|---|
| Pod identity | random hash name | ordered fixed name (web-0,1,2) | one per node |
| Pod count | set by replicas | set by replicas | equal to node count |
| Creation/deletion order | no ordering guarantee | ordered (forward on create, reverse on delete) | tied to node add/remove |
| Dedicated storage | none (shared or stateless) | per-Pod PVC (volumeClaimTemplates) | usually node hostPath |
| headless Service | not needed | required (serviceName) | not needed |
| Typical use | stateless web/API | DB/distributed systems | log/monitoring/CNI |
The core decision criterion is simple. If they are interchangeable, Deployment; if they need a unique identity and dedicated data, StatefulSet; if you have to install one per node, DaemonSet.
Exam points #
Here are the parts that often trip people up when these two controllers show up on CKAD.
- No generator. You cannot create a skeleton for DaemonSet or StatefulSet with kubectl create. The fastest approach is to generate a Deployment skeleton with k create deploy ... $do > x.yaml (assuming $do holds --dry-run=client -o yaml), then change the kind and remove the unnecessary fields.
- What to remove when converting to a DaemonSet. In the Deployment skeleton, you must delete the replicas, strategy, and status lines for it to become a valid DaemonSet.
- Missing serviceName on a StatefulSet. serviceName is a required field, and the headless Service (clusterIP: None) it points to must actually exist for the Pod’s DNS to work. Missing either one of these costs points.
- Matching names between volumeClaimTemplates and volumeMounts. The metadata.name of volumeClaimTemplates and the name of the container’s volumeMounts must match for the PVC to be mounted.
- PVCs remain. Deleting the StatefulSet does not automatically delete the PVCs. If the problem also asks for cleanup, delete the PVCs separately.
- Verifying the DaemonSet’s Pod count. You validate the answer with k get pods -o wide, checking that one is up on each node. If it also has to run on control plane nodes, do not omit the tolerations.
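The Deployment-to-DaemonSet conversion from the points above can be sketched as an annotated skeleton. This is approximately what a Deployment generator emits for a hypothetical log-agent workload; the exact fields vary by kubectl version, so treat it as a guide to what to look for, not as literal output.

```yaml
# Approximate generator output for a Deployment, annotated with the edits to make.
apiVersion: apps/v1
kind: Deployment          # change to DaemonSet
metadata:
  labels:
    app: log-agent
  name: log-agent
spec:
  replicas: 1             # delete: a DaemonSet has no replicas
  selector:
    matchLabels:
      app: log-agent
  strategy: {}            # delete: a DaemonSet uses updateStrategy instead
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - image: fluent/fluent-bit:2.2
        name: fluent-bit
        resources: {}
status: {}                # delete: leftover from the generator
```

After those four edits the selector and template stay as they are, and any tolerations or nodeSelector go under the template's spec.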
Wrap-up #
Here is what we nailed down in this post.
- A DaemonSet places one Pod on all (or some) nodes. It is used for log collectors, node monitoring, and CNI; you narrow the target nodes with nodeSelector and extend them to tainted nodes with tolerations.
- A StatefulSet provides a stable network ID, ordered creation/deletion, and dedicated storage. It is used for DBs and distributed systems, and the headless Service and volumeClaimTemplates are the core.
- A headless Service uses clusterIP: None to create an individual DNS record for each Pod, providing fixed addresses like web-0.web.
- Criteria for choosing among the three controllers. Interchangeable → Deployment, unique identity/data → StatefulSet, one per node → DaemonSet.
- Exam cautions. No generator, serviceName required, volume names must match, PVCs remain.
If you need the bigger picture of workload controllers, the Kubernetes intermediate track covers the same resources from an operations perspective in more depth.
Next: Workloads 3 #
Having learned DaemonSet and StatefulSet, we have covered the always-running workloads. What remains is work that runs once or periodically and then finishes.
In #7 Workloads 3: Job, CronJob (backoff, concurrency), we will cover Job and CronJob, which handle batch work. We will work through it in YAML — controlling parallel execution with completions and parallelism, handling failures and timeouts with backoffLimit and activeDeadlineSeconds, and CronJob’s concurrencyPolicy and schedule notation.