Docker Advanced #6: Production Operations — graceful shutdown, healthcheck, restart
The final post of Docker Advanced. Build / multi-arch / security / resource limits — all the previous posts dealt with the shape of one container. This post collects the details that keep a container shutting down cleanly and recovering reliably in production.
This post in the Docker Advanced series:
- #1 BuildKit and buildx
- #2 Multi-architecture images
- #3 Image security
- #4 SBOM and signing
- #5 Resource limits and cgroups
- #6 Production operations — restart policy, healthcheck, graceful shutdown ← this post
What docker stop actually does — once more, deeper #
The territory briefly touched in Basics #3. From an operations angle:
docker stop myapp
│
▼
SIGTERM is sent to the container's PID 1
│
▼
Wait 10 seconds by default (--time to adjust)
│
├─ PID 1 exits cleanly → that exit code stays
│
└─ Timeout → SIGKILL
The whole flow depends on PID 1. PID 1 must catch SIGTERM, propagate it to its children, and finish its own cleanup; only then does the container shut down gracefully.
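The same sequence can be reproduced outside Docker against an ordinary child process. This is a minimal Python sketch, assuming a POSIX host; `stop_like_docker`, `start`, and the two inline child scripts are hypothetical names for illustration, not Docker's implementation.

```python
import signal
import subprocess
import sys

# Child that installs a SIGTERM handler and exits cleanly.
GRACEFUL_CHILD = """
import signal, sys, time
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""

# Child that ignores SIGTERM, like an app that never reacts to it.
STUBBORN_CHILD = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""


def start(script: str) -> subprocess.Popen:
    """Start a child and wait until it has set up its signal handling."""
    proc = subprocess.Popen([sys.executable, "-c", script], stdout=subprocess.PIPE)
    proc.stdout.readline()  # blocks until the child prints "ready"
    proc.stdout.close()
    return proc


def stop_like_docker(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        return proc.wait()
```

With `GRACEFUL_CHILD` the function returns the child's own exit code (0) almost immediately. With `STUBBORN_CHILD` it returns only after the full grace period, with a negative return code that shows the child was killed by a signal.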
The PID 1 problem — common breakage inside containers #
PID 1 is special on Linux:
- Adopts orphaned children — must reap zombies
- Different signal-delivery rules: the kernel does not apply a signal's default action to PID 1, so SIGTERM is ignored unless the process installs a handler for it
Most apps (Python, Node, Java) were never designed to run as PID 1. Two problems result inside containers:
Problem 1 — SIGTERM is ignored #
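A process running as PID 1 that has not registered a SIGTERM handler simply ignores the signal, and many runtimes (plain Python and Node scripts, for example) do not register one by default. Docker’s SIGTERM goes unhandled, and 10 seconds later SIGKILL takes the container out forcefully.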
Diagnosis:
docker run --rm -d --name test myapp
docker stop test
# If shutdown takes ~10 seconds, SIGTERM is likely being ignored.
# If it ends in 1–2 seconds, you're good.
Problem 2 — Zombie process accumulation #
If the app spawns child processes (Node’s child_process.spawn, Python’s subprocess.Popen) and never wait()s for them, each one that exits stays behind as a zombie. Orphaned processes are also re-parented to PID 1. On a normal system, init reaps all of these; if the container’s PID 1 is an app that does not, zombies pile up.
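The reaping side of that job fits in a few lines. The following Python sketch assumes a POSIX host, and both helper names are hypothetical; it illustrates the mechanism rather than how any particular init is implemented.

```python
import os
import time


def spawn_short_lived_children(count: int) -> list[int]:
    """Fork children that exit immediately; until reaped they remain zombies."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit at once, skipping the parent's cleanup
        pids.append(pid)
    return pids


def reap_exited_children() -> list[int]:
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A PID 1 that calls something like `reap_exited_children()` whenever a child exits never accumulates zombies. An app that was not written with this in mind usually has no such loop.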
Solution — a small init at PID 1 #
The fix is to put a small init at PID 1 with the app underneath. Docker provides this with one option.
docker run --init -d myapp

services:
  web:
    image: myapp
    init: true
--init runs tini as PID 1 and your Dockerfile’s CMD becomes its child. tini:
- Forwards received SIGTERMs to children
- Automatically reaps zombies
That one line solves both problems. Almost always set init: true on production containers.
dumb-init — baked into the Dockerfile #
The other path: run dumb-init (Yelp) as ENTRYPOINT.
FROM python:3.14-slim
RUN apt-get update && apt-get install -y --no-install-recommends dumb-init && \
    rm -rf /var/lib/apt/lists/*
COPY app.py .
ENTRYPOINT ["dumb-init", "--"]
CMD ["python", "app.py"]
dumb-init -- becomes PID 1 with Python as its child. Same idea as tini. Docker’s --init is lighter and preferred, but when you don’t know where the image will run (no guarantee --init will be set), baking dumb-init into the image is safer.
The app’s own SIGTERM handler #
Even with an init handling signal delivery, the app must respond to the signal for a graceful shutdown to actually happen. Quick patterns per language.
Node.js #
const server = app.listen(3000);
const shutdown = () => {
  console.log('Received SIGTERM, draining connections...');
  server.close(() => {
    console.log('All connections drained, exiting.');
    process.exit(0);
  });
  // Force exit if not finished cleanly within 30 seconds
  setTimeout(() => {
    console.error('Force exit after 30s');
    process.exit(1);
  }, 30000).unref();
};
process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
server.close() stops accepting new connections and calls back once the existing connections have ended. The whole sequence has to fit inside the stop grace window (10s by default, or your extended value) to avoid SIGKILL, so the 30-second force-exit timer above only makes sense with a grace period longer than 30s.
Python (FastAPI / Django) #
Production servers like uvicorn / gunicorn handle SIGTERM automatically — you rarely write this yourself. Just ensure workers have enough time to finish in-flight requests.
gunicorn app:app \
  --workers 4 \
  --graceful-timeout 30 \
  --timeout 60 \
  --bind 0.0.0.0:8000
--graceful-timeout 30 gives workers up to 30s after SIGTERM to finish in-flight requests; workers still running after that are killed. Match Docker’s stop timeout:
services:
  web:
    stop_grace_period: 35s # slightly longer than gunicorn's graceful-timeout
Go #
srv := &http.Server{Addr: ":8000", Handler: mux}
go srv.ListenAndServe()
stop := make(chan os.Signal, 1)
signal.Notify(stop, syscall.SIGTERM, syscall.SIGINT)
<-stop
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
srv.Shutdown(ctx)
http.Server.Shutdown is the canonical pattern: finish in-flight requests within a context timeout, then exit.
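A plain Python script that runs without gunicorn or uvicorn (a queue worker, for instance) has to do this itself. A minimal sketch of the usual flag-based pattern follows; `run_worker`, `request_stop`, and the job list are hypothetical names, and real workers would pull jobs from a queue instead.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: only set a flag; the main loop does the actual cleanup."""
    stop_requested.set()


def install_signal_handlers():
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)


def run_worker(jobs, handle):
    """Process jobs one at a time, checking for a stop request between jobs.

    Returns the jobs that were never started, so the caller can requeue them.
    """
    pending = list(jobs)
    while pending and not stop_requested.is_set():
        handle(pending.pop(0))  # the in-flight job always runs to completion
    return pending
```

At startup the script would call `install_signal_handlers()`, then `run_worker(...)`, and exit with code 0 once it returns. The longest single job has to fit inside the stop grace window, for the same reason as in the Node and Go examples.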
stop_grace_period — extending the window #
For apps that can’t finish cleanly in 10 seconds (e.g., processing large file uploads):
services:
  web:
    stop_grace_period: 60s

docker stop --time 60 myapp
A common production tuning. But too long a window slows deployments, and a load balancer such as ELB may already have marked the backend unhealthy and stopped routing to it, at which point the extra time buys nothing.
Restart policies — deeper #
The table from Intermediate #4, now from an operations angle.
| Policy | When |
|---|---|
| no | One-shot containers (migrations, seeds, builds) |
| always | Always, even on host boot |
| on-failure[:N] | Non-zero exit code, with a max retry count |
| unless-stopped | Always, unless explicitly stopped |
Safe production default — unless-stopped #
The difference between always and unless-stopped confuses people. The distinction is what docker stop means:
- With always, a container stopped with docker stop comes back when the daemon restarts.
- With unless-stopped, a container stopped via docker stop stays stopped across daemon restarts.
It’s more reasonable to honor an operator’s explicit stop, so unless-stopped is the production default.
Restart loop — backoff #
If an app dies on startup every time, restart: always becomes an infinite loop. Docker does not stop the loop, but it slows it down with a backoff that doubles the delay before each restart.
1st restart → wait 100ms
2nd restart → wait 200ms
3rd restart → wait 400ms
...
Nth restart → capped at 1 minute
The delay resets once a restarted container has stayed up for 10 seconds. For containers with many consecutive failures, start with the logs (docker logs --tail 200 <c>) and check OOMKilled (#5).
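To get a feel for how quickly the backoff grows, here is a small calculation. It assumes the schedule Docker documents for restart policies (the delay doubles from 100 ms and is capped at 1 minute); the function names are hypothetical.

```python
def restart_delays_ms(restarts: int, first_ms: int = 100, cap_ms: int = 60_000) -> list[int]:
    """Delay before each consecutive restart: doubling from first_ms, capped at cap_ms."""
    delays = []
    delay = first_ms
    for _ in range(restarts):
        delays.append(delay)
        delay = min(delay * 2, cap_ms)
    return delays


def total_wait_seconds(restarts: int) -> float:
    """Total time spent waiting across the given number of consecutive restarts."""
    return sum(restart_delays_ms(restarts)) / 1000
```

`restart_delays_ms(12)` ends in `51200, 60000, 60000`: the cap is reached on the 11th consecutive restart, after which a crash-looping container restarts roughly once a minute.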
Healthcheck — from operations #
The healthcheck from Intermediate #4, now from an operations angle.
Liveness vs. Readiness — two different questions #
Concepts from K8s, but equally useful as a thinking tool for Docker.
| | Liveness | Readiness |
|---|---|---|
| Question | Is it alive? | Is it ready to receive traffic? |
| On failure | Restart the container | Block traffic (don’t restart) |
| Example | Stuck in deadlock → needs restart | DB temporarily disconnected → just pause traffic |
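Docker has a single healthcheck with no such distinction, and on its own an unhealthy status only marks the container: Swarm services replace unhealthy tasks, while plain docker run and Compose do not restart them. So in Docker-only setups, mixing both concepts in one healthcheck is awkward.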
Workaround — define healthcheck closer to liveness, and handle readiness inside the app. For example, return 503 from /health for the first N seconds after startup, then 200 once ready.
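One way to handle readiness inside the app is sketched below with Python's standard library. It uses a readiness flag that the app sets when warm-up finishes, a variation on the fixed "first N seconds" idea; `ready`, `HealthHandler`, `start_health_server`, and `probe` are hypothetical names, and a real app would expose `/health` through its own web framework.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by the application once migrations / warm-up have finished.
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        # 503 while warming up, 200 once the app can take traffic.
        status = 200 if ready.is_set() else 503
        body = b"ok\n" if status == 200 else b"starting\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep health probes out of the access log


def start_health_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    """Serve /health on a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port: int, host: str = "127.0.0.1") -> int:
    """Return the HTTP status of /health, as a health checker would see it."""
    conn = http.client.HTTPConnection(host, port, timeout=2)
    try:
        conn.request("GET", "/health")
        return conn.getresponse().status
    finally:
        conn.close()
```

The app calls `ready.set()` when startup work is done. Until then a probe sees 503, and afterwards 200, without the container ever being restarted for being "not ready".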
Properties of a good healthcheck #
□ Responds quickly (within 1s)
□ Doesn't recursively check downstream services
□ No side effects
□ Dedicated endpoint (/health) — separated from regular traffic
□ No auth (don't add an attack surface; access from inside only)

✗ Runs DB queries: DB load
✗ Runs business logic: dependencies / load
✗ Calls external APIs: when an external dep goes down, your container goes unhealthy
✗ Requires auth: health checkers also need credentials
A healthcheck only needs to confirm the app is alive and able to serve a request. The health of dependencies belongs to separate monitoring.
Startup grace — start_period #
For apps with migrations or warm-up, a brief unhealthy period right after startup is normal.
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 60s # failures during the first 60 seconds aren't counted
In K8s, this is startupProbe.
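Slim and distroless base images often ship without curl. For a Python image, the probe can be a few lines of standard-library code instead. This is a sketch under that assumption; `check` is a hypothetical helper that follows Docker's healthcheck convention of exit code 0 for healthy and 1 for unhealthy.

```python
import http.client
from urllib.parse import urlsplit


def check(url: str, timeout: float = 3.0) -> int:
    """Return 0 if the endpoint answers with a 2xx status, 1 otherwise."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        status = conn.getresponse().status
        return 0 if 200 <= status < 300 else 1
    except (OSError, http.client.HTTPException):
        return 1  # refused, timed out, or not speaking HTTP
    finally:
        conn.close()
```

Saved as a script that ends with `raise SystemExit(check("http://localhost:8000/health"))`, it can replace the curl command in `test:`, for example `["CMD", "python", "/healthcheck.py"]` (the path is an assumption). Keep the script's timeout below the healthcheck `timeout` so the probe reports its own failure and is not cut off by Docker.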
Logging — operational details #
The stdout principle and log drivers from Intermediate #6, now from an operations angle.
Log rotation #
services:
  web:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
Without these, unbounded disk growth is a classic Docker incident. Set them daemon-wide for safety, in /etc/docker/daemon.json:
{
  "log-driver": "local",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
The local driver stores logs in a more compact internal format, with rotation and compression enabled by default. Docker's documentation recommends it as a way to prevent disk exhaustion.
To external collectors #
As production scales, logs don’t stop at stdout — they flow to external collectors.
services:
  web:
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224
        tag: web.{{.Name}}
From there, route to Loki, Elasticsearch, CloudWatch, or anywhere else.
Monitoring — a one-line extension #
cAdvisor + Prometheus + Grafana from #5 is the first monitoring setup for Docker-only operations. Common panels:
- CPU / memory / network per container
- Restart count (alarm on containers restarting often)
- OOMKill events
- Healthcheck failure rate
- Disk IO
A first alarm rule: “the same container restarts 3+ times in 5 minutes.” Catch frequent OOMKills / crashes early.
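In a Prometheus setup this would be an alerting rule, but the logic itself is a sliding-window count. Here is a minimal Python sketch, with `RestartAlarm` as a hypothetical class and timestamps given in seconds:

```python
from collections import deque


class RestartAlarm:
    """Fires when a container restarts `threshold` times within `window_seconds`."""

    def __init__(self, threshold: int = 3, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._restarts: dict[str, deque] = {}

    def record_restart(self, container: str, timestamp: float) -> bool:
        """Record one restart; return True if the alarm condition is now met."""
        times = self._restarts.setdefault(container, deque())
        times.append(timestamp)
        # Drop restarts that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold
```

Counting per container keeps one noisy service from hiding another, and the window means a container that restarted three times last week does not trip the alarm today.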
Operations checklist #
The checklist for one container, gathering everything across the series:
□ Multi-stage — separate build tools ([Intermediate #1])
□ Base: slim or distroless ([Intermediate #1], [#3])
□ Multi-arch — linux/amd64 + linux/arm64 ([#2])
□ Dockerfile: hadolint clean ([#3])
□ Image: Trivy HIGH/CRITICAL clean ([#3])
□ SBOM attached + cosign signed ([#4])
□ Build with buildx + external cache ([Intermediate #2], [#1])

□ image: digest or semver (no latest)
□ restart: unless-stopped
□ init: true (PID 1 handling)
□ stop_grace_period set (longer than the app's graceful time)
□ healthcheck — fast, light, no auth
□ Resources: mem_limit + cpus + pids_limit ([#5])
□ Security: read_only + tmpfs + cap_drop ALL + no-new-privileges ([#3])
□ Secrets: secrets: or external manager — never in ENV ([Intermediate #5], [#4])
□ Logs: max-size + max-file
□ DB / internal services: bind -p only to 127.0.0.1
□ Per-environment values: .env / override files ([Intermediate #4])

□ Build → multi-arch → SBOM → sign → push in one workflow ([#4])
□ Make verification a gate — Trivy / cosign verify
□ Tagging: semver + Git SHA + latest together ([Basics #5])
□ External cache: type=gha or type=registry ([Intermediate #2])
What’s next — Docker in Practice #
This series went deep on Docker itself. The next series — Docker in Practice — puts everything we’ve built into real app deploys:
- FastAPI containerization — a production-grade Dockerfile
- Django + PostgreSQL compose — admin / static / migration too
- React/Next.js build container — standalone, multi-stage
- Building images in CI — full GitHub Actions workflow
- Registry push and tag strategy — operational details
- Cloud deploy — one of Fly.io / Railway / ECS
A series where every tool we’ve built across Basics / Intermediate / Advanced finally comes together.
Wrap-up #
The picture from this post:
- docker stop = SIGTERM → grace window → SIGKILL. PID 1’s signal handling is the core.
- The PID 1 problem: apps weren’t designed for PID 1. Use init: true or dumb-init.
- The app handles SIGTERM to finish in-flight work: Node’s server.close, gunicorn’s --graceful-timeout, Go’s srv.Shutdown.
- stop_grace_period ensures enough time to clean up.
- restart: unless-stopped is the production default; backoff throttles crash loops but does not end them.
- Healthcheck: fast, light, closer to liveness. Dependency checks belong elsewhere.
- Operations checklist split across image / runtime / deploy.