K8s Basics #4: Deployment and ReplicaSet — Declarative Deploys and Rolling Updates
As #3 (kubectl and your first Pod) showed at the end, a Pod created directly is simply gone once it is deleted or dies. This post covers the first controller that fills that gap automatically, Deployment, and the ReplicaSet sitting underneath it. We go through one full cycle: declaring replicas: 3 to keep Pods alive, watching auto-recovery when one Pod is deleted, and seeing how rolling updates and rollbacks behave when you change an image tag.
This series is K8s Basics, 7 posts.
- #1 What is Kubernetes — why do we need a container orchestrator?
- #2 Local environments — minikube / kind / Docker Desktop k8s
- #3 kubectl and your first Pod
- #4 Deployment and ReplicaSet — declarative deploys and rolling updates ← this post
- #5 Service — ClusterIP / NodePort / LoadBalancer
- #6 ConfigMap / Secret
- #7 Namespaces and labels
By the end of this post you’ll have the first manifest that hands Pod management off to a controller instead of creating Pods by hand. From here on this is essentially the baseline shape of manifests in real-world ops.
Deployment, ReplicaSet, Pod — three layers #
The cast for this post is three resources, but the human only writes the top one. A handy mental picture:
┌──────────────────────┐
│      Deployment      │  ← what you write in the manifest
│        (web)         │
└──────────┬───────────┘
           │ creates / manages
           ▼
┌──────────────────────┐
│      ReplicaSet      │  ← created automatically by Deployment
│     (web-abc123)     │
└──────────┬───────────┘
           │ creates / maintains
           ▼
┌──────────┬──────────┬──────────┐
│   Pod    │   Pod    │   Pod    │  ← actual workload
│ web-...  │ web-...  │ web-...  │
└──────────┴──────────┴──────────┘

One-line responsibilities:
- Deployment — the manifest the human writes. Declares “spin up N Pods of this template, and here’s how to switch to a new version (rolling update).” Effectively the resource you touch most often in real-world ops.
- ReplicaSet — the intermediate object Deployment creates automatically. One job — “keep N Pods of this template alive at all times.” You almost never write a ReplicaSet manifest yourself.
- Pod — the actual workload. ReplicaSet creates them; if one dies, ReplicaSet creates another. The same Pods we hand-created in #3 — but now if one dies, a replacement comes back on its own.
Why two layers? #
It can look like one Deployment layer should be enough. Why does ReplicaSet exist as its own thing? The reason is new-version deploys.
When you push a new version, Deployment creates a brand-new ReplicaSet. It scales the new RS up (1, 2, 3) while scaling the old RS down (2, 1, 0). There’s a brief window where Pods from both RSes are up at the same time. That’s the heart of a rolling update. When it finishes, the old RS sits at 0 replicas but stays around as an object, so a rollback can scale it back up.
In short, Deployment is the layer that handles transitions between versions, and ReplicaSet is the layer that maintains a single version at N. Splitting them lets old and new versions coexist briefly in the same cluster.
Your first Deployment manifest #
This time we write the same nginx:1.27 not as a Pod but as a Deployment. Save the file as web.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80

Three new pieces compared to the Pod manifest in #3:
- apiVersion: apps/v1. Pod was v1, but Deployment lives in the apps API group, at version v1. Controller-style resources (Deployment, StatefulSet, DaemonSet, ReplicaSet) all share that group.
- spec.replicas: 3. The declaration that 3 Pods of this template should always be up.
- spec.selector.matchLabels + spec.template. The label condition Deployment uses to find the Pods it manages, and the template describing the Pod’s shape. The shape under template is exactly the metadata + spec of the Pod we saw in #3.
One rule — selector and template labels must match #
The most common mistake when writing your first manifest is here. spec.selector.matchLabels and spec.template.metadata.labels have to match: every key/value pair in the selector must also appear in the template’s labels (the template may carry extra labels on top). If they don’t, K8s rejects the manifest. It’s not just a convention; it’s a validation rule the API server enforces.
That’s why both fields above are app: web. If you set the selector to app: web but change the template’s label to app: nginx, kubectl apply will spit out:
The Deployment "web" is invalid: spec.template.metadata.labels:
  Invalid value: map[string]string{"app":"nginx"}:
  `selector` does not match template `labels`

Simple mental model: the selector says “how I recognize the Pods I manage,” the template says “the labels on the Pods I create.” They have to match so the controller recognizes the Pods it just created, which is almost a tautology.
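To make the rule concrete, here is a minimal Python sketch of the check as a subset test. It is a toy model, not the API server’s actual code: the function name selector_matches is made up for this example, and it only covers matchLabels (not matchExpressions).

```python
def selector_matches(match_labels: dict[str, str], template_labels: dict[str, str]) -> bool:
    """Return True if every selector key/value pair is present in the template labels."""
    return all(template_labels.get(key) == value for key, value in match_labels.items())


# The manifest from this post: selector and template both say app=web.
print(selector_matches({"app": "web"}, {"app": "web"}))      # True

# The broken variant: selector app=web, template app=nginx.
print(selector_matches({"app": "web"}, {"app": "nginx"}))    # False

# Extra labels on the template are fine as long as the selector's pairs are all there.
print(selector_matches({"app": "web"}, {"app": "web", "tier": "frontend"}))  # True
```

The third call is the case worth remembering: the template can carry more labels than the selector, but never fewer.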
Apply it #
Push web.yaml to the cluster.
kubectl apply -f web.yaml
deployment.apps/web created

Let’s see all three resource kinds at once. kubectl get accepts a comma-separated list.

kubectl get deploy,rs,pods
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/web   3/3     3            3           20s

NAME                         DESIRED   CURRENT   READY   AGE
replicaset.apps/web-abc123   3         3         3       20s

NAME                  READY   STATUS    RESTARTS   AGE
pod/web-abc123-aa11   1/1     Running   0          20s
pod/web-abc123-bb22   1/1     Running   0          20s
pod/web-abc123-cc33   1/1     Running   0          20s

How to read it:
- Deployment row: READY 3/3 means all 3 desired replicas are ready, UP-TO-DATE 3 is the count of Pods updated to the current template, and AVAILABLE 3 is the count alive long enough (past minReadySeconds) to take traffic.
- ReplicaSet row: the name is web-abc123. The abc123 suffix is a hash K8s computes from the template. The columns to read are DESIRED 3 / CURRENT 3 / READY 3.
- Pod row: names like web-abc123-aa11 carry two suffixes. The first part (web-abc123) matches the ReplicaSet name. The chain of who-created-what is right there in the name.
The naming pattern in one line — <deployment>-<replicaset-hash>-<pod-suffix>. You’ll see it constantly through the rest of the series.
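If you ever need to pull that pattern apart in a script, a small Python sketch could look like this. The helper split_pod_name and the PodName type are hypothetical names for this example. Note that the name is only a hint: the authoritative link between a Pod and its ReplicaSet is the Pod’s metadata.ownerReferences, not the string.

```python
from typing import NamedTuple


class PodName(NamedTuple):
    deployment: str
    replicaset: str
    pod_suffix: str


def split_pod_name(pod_name: str) -> PodName:
    """Split a 'deployment-hash-suffix' style Pod name from the right.

    Splitting from the right matters because the Deployment name itself may contain dashes.
    """
    deployment, rs_hash, pod_suffix = pod_name.rsplit("-", 2)
    return PodName(deployment, f"{deployment}-{rs_hash}", pod_suffix)


print(split_pod_name("web-abc123-aa11"))
# PodName(deployment='web', replicaset='web-abc123', pod_suffix='aa11')

print(split_pod_name("my-api-def456-zz99").deployment)  # my-api
```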
Killing a Pod — self-healing #
Time to see what this manifest changed. In #3, deleting a Pod just made it disappear. This time it’s different.
kubectl delete pod web-abc123-aa11
pod "web-abc123-aa11" deleted

Pull the Pod list right after.

kubectl get pods
NAME              READY   STATUS    RESTARTS   AGE
web-abc123-bb22   1/1     Running   0          2m
web-abc123-cc33   1/1     Running   0          2m
web-abc123-dd44   1/1     Running   0          5s

Still three Pods. Look closely: bb22 and cc33 show AGE 2m, but the new dd44 shows AGE 5s. A freshly created Pod. The changed suffix is another hint that this is a new one.
This is the reconcile loop from #1 at work. ReplicaSet holds “3 Pods should exist,” and the moment a human deleted one, desired (3) and actual (2) diverged. The ReplicaSet controller in controller-manager noticed the gap and asked the API server to create one more Pod. The scheduler picked a node, the kubelet started the container, and we’re back to 3. The human did nothing.
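The same thing happens at the node level. If a node hosting some Pods dies, those Pods aren’t literally moved: once K8s gives up on the node (after a timeout, about five minutes by default), the Pods are removed, the ReplicaSet controller creates replacements, and the scheduler places them on other live nodes. The “service stays up when a node dies” line from #1 is, fundamentally, what this ReplicaSet controller solves.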
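The reconcile idea fits in a few lines of Python. This is a toy sketch under heavy assumptions, not how the real controller is written: ToyReplicaSet is a made-up class, Pods are plain strings, and there is no API server, scheduler, or kubelet involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplicaSet:
    name: str
    desired: int
    pods: list[str] = field(default_factory=list)
    created: int = 0  # counter used only to give new Pods unique names

    def reconcile(self) -> None:
        """Close the gap between the desired and the actual Pod count."""
        while len(self.pods) < self.desired:   # too few: create replacements
            self.created += 1
            self.pods.append(f"{self.name}-pod{self.created}")
        while len(self.pods) > self.desired:   # too many: remove the surplus
            self.pods.pop()


rs = ToyReplicaSet(name="web-abc123", desired=3)
rs.reconcile()
print(rs.pods)  # ['web-abc123-pod1', 'web-abc123-pod2', 'web-abc123-pod3']

rs.pods.remove("web-abc123-pod1")  # a human deletes one Pod
rs.reconcile()
print(rs.pods)  # ['web-abc123-pod2', 'web-abc123-pod3', 'web-abc123-pod4']
```

The point of the sketch is that reconcile never asks why a Pod is missing. It only compares two numbers and acts on the difference, which is why a deleted Pod and a dead node end up handled the same way.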
Adjusting replicas #
When 3 isn’t enough — or it’s too many — there are two ways to adjust.
Declarative — change the number in the manifest and apply again. The cleanest way.
spec:
  replicas: 5
  ...

kubectl apply -f web.yaml
deployment.apps/web configured

kubectl get pods will soon show 5. Scaling down is the same: drop the number in the manifest and apply.
Imperative — fast, but temporary.
kubectl scale deploy/web --replicas=5
deployment.apps/web scaled

Useful for a quick burst up or down. The downside is clear: the manifest’s replicas value drifts from the cluster’s actual state. The manifest still says replicas: 3, but the cluster has 5 running. Next time someone runs kubectl apply -f web.yaml without thinking, those 5 collapse back to 3.
The one-line rule — the declarative manifest is always the source of truth. Use kubectl scale only for a quick fix during debugging, or when you’re about to sync the manifest immediately after. The normal flow is: edit the manifest, then apply. That principle is the foundation of the entire desired-state model from #1.
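The drift problem can be sketched in Python with two dictionaries. This is a deliberately simplified model: apply_manifest is a hypothetical stand-in that just overwrites every field the manifest sets, while the real kubectl apply computes a more careful merge. The outcome for replicas is the same, though.

```python
def apply_manifest(manifest: dict, live: dict) -> dict:
    """Toy apply: every field written in the manifest overwrites the live value."""
    return {**live, **manifest}


manifest = {"image": "nginx:1.27", "replicas": 3}

live = apply_manifest(manifest, {})    # first apply: the cluster runs 3 replicas
live["replicas"] = 5                   # kubectl scale --replicas=5; web.yaml is untouched

# Fields where the manifest and the cluster disagree: (manifest value, live value).
drift = {key: (manifest[key], live[key]) for key in manifest if manifest[key] != live[key]}
print(drift)                           # {'replicas': (3, 5)}

live = apply_manifest(manifest, live)  # someone re-applies the unchanged manifest
print(live["replicas"])                # 3
```

On a real cluster, kubectl diff -f web.yaml is a quick way to see this kind of gap before you apply.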
Rolling updates — the default of zero-downtime deploys #
Now the first new-version deploy of this series. Change the image tag from nginx:1.27 to nginx:1.28 — one character.
      containers:
        - name: web
          image: nginx:1.28
          ports:
            - containerPort: 80

kubectl apply -f web.yaml
deployment.apps/web configured

The output is one line, but a fair amount happens behind it.
What’s happening underneath #
The Deployment controller notices the template changed and creates a new ReplicaSet for the new template. It scales the new RS replicas up from 0 to 1, 2, 3, and scales the old RS down from 3 to 2, 1, 0. At each step the new Pod has to reach Ready before the next step happens.
In the middle of a rollout, kubectl get rs shows two ReplicaSets.
kubectl get rs
NAME         DESIRED   CURRENT   READY   AGE
web-abc123   2         2         2       10m   ← old RS (1.27)
web-def456   2         2         1       30s   ← new RS (1.28)

Old RS down to 2, new RS up to 2. That snapshot is the heart of a rolling update. Pods from both RSes are up briefly, and the Service we’ll cover in #5 spreads traffic across whichever of them are Ready.
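The seesaw between the two ReplicaSets can be sketched as a small Python simulation. Treat it as a toy model rather than the Deployment controller’s real algorithm: rolling_update is a made-up function, the limits are passed in as plain Pod counts, and every new Pod is assumed to become Ready before the old RS is touched. The example inputs max_surge=1 and max_unavailable=0 are chosen because they reproduce the 2/2 snapshot above.

```python
def rolling_update(desired: int, max_surge: int, max_unavailable: int) -> list[tuple[int, int]]:
    """Return (old_replicas, new_replicas) snapshots of a toy rolling update.

    Simplification: every new Pod is assumed Ready before the old RS is scaled down.
    """
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = desired, 0
    snapshots = [(old, new)]
    while new < desired or old > 0:
        # Scale the new RS up, but never exceed desired + max_surge Pods in total.
        step_up = max(0, min(desired - new, desired + max_surge - (old + new)))
        if step_up:
            new += step_up
            snapshots.append((old, new))
        # Scale the old RS down, but keep at least desired - max_unavailable Pods.
        step_down = max(0, min(old, old + new - (desired - max_unavailable)))
        if step_down:
            old -= step_down
            snapshots.append((old, new))
    return snapshots


for old, new in rolling_update(desired=3, max_surge=1, max_unavailable=0):
    print(f"old RS: {old}  new RS: {new}")
# old RS: 3  new RS: 0
# old RS: 3  new RS: 1
# old RS: 2  new RS: 1
# old RS: 2  new RS: 2
# old RS: 1  new RS: 2
# old RS: 1  new RS: 3
# old RS: 0  new RS: 3
```

The fourth line is the moment captured by kubectl get rs: two old Pods, two new Pods, four in total.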
Watching progress #
The most convenient one-liner to track a rollout:
kubectl rollout status deploy/web
Waiting for deployment "web" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "web" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "web" rollout to finish: 1 old replicas are pending termination...
deployment "web" successfully rolled out

Steps print one line at a time, and a success line at the end means the deploy is done. After that, kubectl get rs shows the old RS at DESIRED 0 but still around as an object. That structure makes the next section’s rollback possible.
One line on the default strategy #
The flow above happens because Deployment’s spec.strategy defaults to RollingUpdate, with two parameters:
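- maxSurge: 25%. How many extra Pods over desired are allowed temporarily. The percentage is rounded up, so with 3 desired, +1 is allowed.
- maxUnavailable: 25%. How many Pods under desired may be unavailable temporarily. The percentage is rounded down, so with 3 desired, 0.75 becomes 0: the rollout never drops below 3 available Pods and has to surge first.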
The other option is Recreate — kill all old Pods first, then start new ones. No zero-downtime, but useful for stateful workloads where two versions can’t coexist (e.g., a DB migration holding the same volume). For an ordinary web server, the default RollingUpdate is enough.
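The two percentages round in opposite directions, which matters a lot at small replica counts. A short Python sketch of the arithmetic, with resolve_rolling_update_params as a made-up helper name and integer math used to avoid floating-point surprises:

```python
def resolve_rolling_update_params(
    desired: int, max_surge_pct: int = 25, max_unavailable_pct: int = 25
) -> tuple[int, int]:
    """Turn percentage settings into absolute Pod counts.

    maxSurge rounds up, maxUnavailable rounds down.
    """
    max_surge = (desired * max_surge_pct + 99) // 100        # ceiling division
    max_unavailable = desired * max_unavailable_pct // 100   # floor division
    return max_surge, max_unavailable


for replicas in (3, 4, 10):
    surge, unavailable = resolve_rolling_update_params(replicas)
    print(f"replicas={replicas}: maxSurge={surge}, maxUnavailable={unavailable}")
# replicas=3: maxSurge=1, maxUnavailable=0
# replicas=4: maxSurge=1, maxUnavailable=1
# replicas=10: maxSurge=3, maxUnavailable=2
```

With 3 replicas the defaults resolve to one extra Pod and zero missing Pods, so a rollout always adds a new Pod before it removes an old one.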
What if the rollout fails? #
Try a wrong image tag on purpose — say nginx:1.99-not-real.
kubectl apply -f web.yaml

kubectl rollout status deploy/web hangs for a long time, and kubectl get pods shows a freshly created Pod stuck at ImagePullBackOff.

NAME              READY   STATUS             RESTARTS   AGE
web-abc123-aa11   1/1     Running            0          15m
web-abc123-bb22   1/1     Running            0          15m
web-abc123-cc33   1/1     Running            0          15m
web-ghi789-zz99   0/1     ImagePullBackOff   0          40s

Notably, the three old Pods are alive and well. If the new Pod can’t reach Ready, Deployment refuses to advance to the next step. So it doesn’t scale the old RS down to 0. Even with the rollout stuck, the old version keeps taking traffic normally. That’s the core of zero-downtime.
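Why the old Pods survive can be reduced to one inequality. The Python sketch below is a simplified model with a hypothetical function name, old_pods_removable. It assumes the controller only has to keep at least desired minus max_unavailable Ready Pods, and the example uses max_unavailable=0, which matches the listing above where no old Pod was removed.

```python
def old_pods_removable(desired: int, max_unavailable: int, old_ready: int, new_ready: int) -> int:
    """How many old Pods a toy Deployment controller may remove right now.

    It must keep at least desired - max_unavailable Ready Pods in total.
    """
    min_available = desired - max_unavailable
    return max(0, min(old_ready, old_ready + new_ready - min_available))


# Healthy rollout: the first new Pod is Ready, so one old Pod can go.
print(old_pods_removable(desired=3, max_unavailable=0, old_ready=3, new_ready=1))  # 1

# Broken image: the new Pod sits in ImagePullBackOff and never becomes Ready.
print(old_pods_removable(desired=3, max_unavailable=0, old_ready=3, new_ready=0))  # 0
```

As long as new_ready stays at 0, the answer stays at 0, and the rollout waits instead of cutting into the working version.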
The debug order is the same as the closing pattern from #3:
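kubectl describe deploy/web
kubectl describe pod web-ghi789-zz99

Once the progress deadline passes (spec.progressDeadlineSeconds, 600 seconds by default), the Deployment’s Progressing condition turns False with reason ProgressDeadlineExceeded (message: ReplicaSet ... has timed out progressing), which describe deploy lists under Conditions. describe pod’s Events have Failed to pull image "nginx:1.99-not-real". The answer is almost always in those two outputs.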
Rollback #
If a bad version made it out, rolling back is one command.
kubectl rollout history deploy/web
deployment.apps/web
REVISION  CHANGE-CAUSE
1         <none>
2         <none>

A revision list. 1 is the original nginx:1.27; 2 is the nginx:1.28 we just rolled out. To roll back to the previous revision:

kubectl rollout undo deploy/web
deployment.apps/web rolled back

To pick a specific revision, use --to-revision:
kubectl rollout undo deploy/web --to-revision=1

The reason this works boils down to one line: a revision is just an old ReplicaSet still hanging around. When the new version was deployed, the old ReplicaSet didn’t disappear; it sat at replicas: 0 but stayed as an object. undo puts that old RS’s Pod template back into the Deployment, and the controller scales the same RS back up to N through the usual rolling-update steps. Nothing has to be rebuilt or re-applied, so the old version is taking traffic again as soon as its Pods are Ready.
By default K8s keeps 10 revisions. spec.revisionHistoryLimit raises or lowers it. Too high, and old ReplicaSets pile up in the cluster and clutter kubectl get rs; too low, and you can’t reach back to a much older version in one step. Match it to your deploy frequency. For a typical web service, the default 10 is fine.
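Revision bookkeeping is easy to model in Python. The sketch below is a toy under stated assumptions: ToyDeployment is a made-up class, each “ReplicaSet” is reduced to an image tag, and history_limit plays the role of revisionHistoryLimit (the number of old revisions kept next to the current one). It also mirrors one detail of the real behavior: after an undo, the restored revision shows up under a new, higher revision number.

```python
from typing import Optional


class ToyDeployment:
    """Keeps old 'ReplicaSets' (here: just image tags) so undo has something to return to."""

    def __init__(self, image: str, history_limit: int = 10) -> None:
        self.history_limit = history_limit
        self.next_revision = 1
        self.revisions: dict[int, str] = {}  # revision number -> image
        self.rollout(image)

    @property
    def current(self) -> str:
        return self.revisions[max(self.revisions)]

    def rollout(self, image: str) -> None:
        self.revisions[self.next_revision] = image
        self.next_revision += 1
        # Prune: keep the current revision plus at most history_limit old ones.
        while len(self.revisions) > self.history_limit + 1:
            del self.revisions[min(self.revisions)]

    def undo(self, to_revision: Optional[int] = None) -> None:
        if len(self.revisions) < 2:
            raise ValueError("no earlier revision to roll back to")
        if to_revision is None:
            to_revision = sorted(self.revisions)[-2]  # the previous revision
        image = self.revisions.pop(to_revision)       # the old entry is reused...
        self.rollout(image)                           # ...under a new revision number


web = ToyDeployment("nginx:1.27")
web.rollout("nginx:1.28")
print(web.revisions)  # {1: 'nginx:1.27', 2: 'nginx:1.28'}

web.undo()
print(web.current)    # nginx:1.27
print(web.revisions)  # {2: 'nginx:1.28', 3: 'nginx:1.27'}
```

With history_limit=0 the model keeps nothing to go back to, which is exactly why a very low revisionHistoryLimit makes rollbacks impossible.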
What Deployment doesn’t solve #
Deployment doesn’t fit every workload shape. The ones with a different grain:
- Stateful workloads: for things like databases where each instance needs its own name and its own disk, StatefulSet is the right resource. Pod names are stable (web-0, web-1), and a 1:1 PVC defined in the manifest is attached to each. Start order is also guaranteed (0→1→2). Deployment treats Pod names and disks as randomized, so it’s not a fit for DBs.
- One-per-node workloads: log shippers (Fluent Bit, Filebeat), node monitors (Node Exporter), CNI agents. DaemonSet is right. When a new node joins the cluster, one shows up on it automatically.
- One-shot jobs: migrations, backups, batch jobs, anything that runs once and finishes. Use Job (immediate) or CronJob (scheduled). Workloads where Pods naturally end up in Succeeded.
These three are covered one at a time in K8s Intermediate. This series stays focused on Deployment, the one you touch most. But it’s worth getting the categorization right in your head ahead of time — stateless → Deployment, stateful → StatefulSet, one-per-node → DaemonSet, one-shot → Job.
Clean up #
Tear down today’s resources cleanly. Deleting one Deployment removes its ReplicaSets and Pods underneath — K8s handles that through owner references and garbage collection.
kubectl delete -f web.yaml
deployment.apps "web" deleted

kubectl get deploy,rs,pods
No resources found in default namespace.

By name works too:

kubectl delete deploy web

Deleting the Deployment alone also removes its ReplicaSet and Pods. The owner-reference model works the same way throughout the rest of the series.
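The cascade is easy to picture as a walk over owner links. The Python sketch below is a toy model, not the real garbage collector: cascade_delete is a hypothetical function, and each object is reduced to a name plus the name of its single owner.

```python
from typing import Optional


def cascade_delete(owners: dict[str, Optional[str]], target: str) -> list[str]:
    """Return the names removed when target is deleted: itself plus everything it owns, transitively."""
    removed, queue = [], [target]
    while queue:
        name = queue.pop(0)
        removed.append(name)
        # Anything whose owner is the object just removed goes next.
        queue.extend(child for child, owner in owners.items() if owner == name)
    return removed


# object name -> name of its owner (None means nobody owns it)
owners = {
    "deployment/web": None,
    "replicaset/web-abc123": "deployment/web",
    "replicaset/web-def456": "deployment/web",
    "pod/web-def456-aa11": "replicaset/web-def456",
    "pod/web-def456-bb22": "replicaset/web-def456",
    "pod/standalone": None,
}

print(cascade_delete(owners, "deployment/web"))
# ['deployment/web', 'replicaset/web-abc123', 'replicaset/web-def456',
#  'pod/web-def456-aa11', 'pod/web-def456-bb22']
```

The hand-made pod/standalone is untouched because nothing owns it. If you ever want the opposite of the default behavior, kubectl delete has a --cascade=orphan option that leaves the dependents in place.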
Summary #
What this post pinned down:
- Deployment / ReplicaSet / Pod — three layers. The human only writes Deployment. ReplicaSet is the auto-created middle object; Pod is the workload it produces.
- The manifest spine is apiVersion: apps/v1 / kind: Deployment / metadata / spec. Inside spec, the new fields are replicas, selector.matchLabels, and template, and the selector and template labels must match.
- Deleting a Pod gets it replaced soon after by the ReplicaSet controller, which closes the gap between desired (N) and actual. It is the simplest face of the reconcile loop from #1.
- For replicas adjustments, edit the manifest and apply; kubectl scale is temporary. The declarative manifest is always the source of truth.
- Rolling updates work by creating a new ReplicaSet and gradually emptying the old one. Default strategy is RollingUpdate (maxSurge 25%, maxUnavailable 25%); track with kubectl rollout status.
- Rollback is one command: kubectl rollout undo. It works because the old ReplicaSet was sitting at replicas: 0 the whole time.
Next — Service #
Even now, one thing isn’t solved — how do you get traffic from outside the cluster to those Pods? Our 3 nginx Pods have cluster-internal IPs, but those IPs change every time a Pod dies and is recreated. ReplicaSet keeps the Pods alive, but the moving IPs leave clients without a stable place to connect to.
#5 Service — ClusterIP / NodePort / LoadBalancer covers (1) how Service puts a stable virtual IP / DNS name in front of Pods, (2) the three types — ClusterIP for in-cluster traffic, NodePort to expose via node ports, LoadBalancer to attach a cloud LB, and (3) putting a Service in front of the app: web Pods we just built and making the first external connection. The 3 Pods from this post turn into “a service with an address” for the first time there.