Docker Advanced #6: Production Operations — graceful shutdown, healthcheck, restart

The final post of Docker Advanced. Build / multi-arch / security / resource limits — all the previous posts dealt with the shape of one container. This post collects the details that keep a container shutting down cleanly and recovering reliably in production.

What docker stop actually does — once more, deeper #

The territory briefly touched in Basics #3. From an operations angle:

docker stop flow
docker stop myapp
SIGTERM is sent to the container's PID 1
Wait 10 seconds by default (--time to adjust)
   ├─ PID 1 exits cleanly → that exit code stays
   └─ Timeout → SIGKILL

The whole weight of this flow rides on PID 1. PID 1 must catch SIGTERM, propagate it to its children, and finish its own cleanup — only then does the container shut down gracefully.
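The same escalation can be reproduced against an ordinary local child process. The sketch below only illustrates the SIGTERM, wait, SIGKILL sequence; it is not Docker's implementation, and stop_like_docker and grace_seconds are hypothetical names chosen for this example.

```python
import subprocess


def stop_like_docker(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Mimic the docker stop escalation on a local child process (POSIX only)."""
    proc.terminate()  # sends SIGTERM
    try:
        # The process exited on its own within the grace window.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # grace window elapsed: send SIGKILL
        return proc.wait()
```

On POSIX, a return code of -15 means the child died from SIGTERM and -9 means it had to be killed with SIGKILL. A child that handles SIGTERM and exits by itself returns its own exit code instead.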

The PID 1 problem — common breakage inside containers #

PID 1 is special on Linux:

  • Adopts orphaned children — must reap zombies
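  • Different signal-delivery rules: the kernel does not apply default actions to PID 1, so a signal without an explicit handler is ignored (SIGKILL sent from outside the container still works)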

Most apps (Python, Node, Java) were never designed to run as PID 1. Two problems result inside containers:

Problem 1 — SIGTERM is ignored #

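When an app runs as PID 1, the kernel no longer applies SIGTERM’s default action of terminating the process, so unless the app registers a handler the signal is silently dropped. Docker’s SIGTERM goes unhandled, and 10 seconds later SIGKILL takes the container out forcefully.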

Diagnosis:

Does the app receive SIGTERM?
docker run --rm -d --name test myapp
time docker stop test
# If this takes ~10 seconds, SIGTERM is likely being ignored and SIGKILL ended the container.
# If it returns within 1–2 seconds, the signal is being handled.

Problem 2 — Zombie process accumulation #

If the app spawns child processes (Node’s child_process.spawn, Python’s subprocess.Popen) and never wait()s for them, each child that exits stays behind as a zombie. Children orphaned by an exiting parent are re-parented to PID 1, and on a normal host the init process reaps them. If the container’s PID 1 is the app and it never reaps, zombies pile up and can eventually exhaust the PID limit.
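Reaping itself is a small loop: call waitpid without blocking until no exited child is left. A minimal sketch for Linux/POSIX, where reap_children is a hypothetical helper and not code taken from any real init:

```python
import os


def reap_children() -> list[int]:
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns immediately if none has exited.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

An init typically runs a loop like this whenever it receives SIGCHLD. An app running as PID 1 that never calls wait leaves those entries in the process table.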

Solution — a small init at PID 1 #

The fix is to put a small init at PID 1 with the app underneath. Docker provides this with one option.

--init
docker run --init -d myapp
compose
services:
  web:
    image: myapp
    init: true

--init runs tini as PID 1 and your Dockerfile’s CMD becomes its child. tini:

  • Forwards received SIGTERMs to children
  • Automatically reaps zombies

That one line solves both problems. Set init: true on production containers unless the image already starts its own init.

dumb-init — baked into the Dockerfile #

The other path: run dumb-init (Yelp) as ENTRYPOINT.

Dockerfile
FROM python:3.14-slim
RUN apt-get update && apt-get install -y --no-install-recommends dumb-init && \
    rm -rf /var/lib/apt/lists/*
COPY app.py .
ENTRYPOINT ["dumb-init", "--"]
CMD ["python", "app.py"]

dumb-init -- becomes PID 1 with Python as its child. Same idea as tini. Docker’s --init is lighter and preferred, but when you don’t know where the image will run (no guarantee --init will be set), baking dumb-init into the image is safer.

The app’s own SIGTERM handler #

Even with init handling signal delivery, the app must still respond to the signal for a graceful shutdown to actually happen. Quick patterns per language.

Node.js #

server.js
const server = app.listen(3000);

const shutdown = () => {
  console.log('Received SIGTERM, draining connections...');
  server.close(() => {
    console.log('All connections drained, exiting.');
    process.exit(0);
  });

  // Force exit if not finished cleanly within 30 seconds
  setTimeout(() => {
    console.error('Force exit after 30s');
    process.exit(1);
  }, 30000).unref();
};

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

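server.close() stops accepting new connections and invokes the callback once existing connections have finished. Keep the force-exit timer shorter than the SIGTERM grace window (10s by default, or your extended stop_grace_period): the 30-second timer above only makes sense with a grace period longer than 30 seconds, otherwise SIGKILL arrives first.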

Python (FastAPI / Django) #

Production servers like uvicorn / gunicorn handle SIGTERM automatically — you rarely write this yourself. Just ensure workers have enough time to finish in-flight requests.

gunicorn options
gunicorn app:app \
  --workers 4 \
  --graceful-timeout 30 \
  --timeout 60 \
  --bind 0.0.0.0:8000

--graceful-timeout 30 gives workers up to 30 seconds after SIGTERM to finish in-flight requests before they are force-killed. Set Docker’s stop timeout slightly longer:

compose
services:
  web:
    stop_grace_period: 35s   # slightly longer than gunicorn's graceful-timeout
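For a plain Python process that does not run under gunicorn or uvicorn (a queue worker, for example), the handler has to be written by hand. A minimal sketch, where run_worker, process_one, and poll_seconds are hypothetical names for this example:

```python
import signal
import threading

stop_requested = threading.Event()


def handle_stop(signum, frame):
    # Only set a flag here; the main loop decides when it is safe to exit.
    stop_requested.set()


def run_worker(process_one, poll_seconds: float = 0.1) -> int:
    """Process items until SIGTERM/SIGINT arrives; return how many were handled."""
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, handle_stop)
    signal.signal(signal.SIGINT, handle_stop)
    done = 0
    while not stop_requested.is_set():
        process_one()  # the current unit of work always runs to completion
        done += 1
        stop_requested.wait(poll_seconds)  # sleep, but wake up early on a stop request
    return done
```

The handler never interrupts a unit of work. It only prevents the next one from starting, so a single unit should take well under the stop grace period.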

Go #

main.go
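srv := &http.Server{Addr: ":8000", Handler: mux}
go func() {
	// ErrServerClosed is the expected result once Shutdown has been called.
	if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
		log.Fatalf("listen: %v", err)
	}
}()

stop := make(chan os.Signal, 1)
signal.Notify(stop, syscall.SIGTERM, syscall.SIGINT)
<-stop // block until Docker sends SIGTERM (or Ctrl+C)

// Give in-flight requests up to 30 seconds to finish.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := srv.Shutdown(ctx); err != nil {
	log.Printf("shutdown did not finish in time: %v", err)
}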

http.Server.Shutdown is the canonical pattern — finish in-flight requests within a context timeout, then exit.

stop_grace_period — extending the window #

For apps that can’t finish cleanly in 10 seconds (e.g., processing large file uploads):

compose
services:
  web:
    stop_grace_period: 60s
docker stop
docker stop --time 60 myapp

This is a common production tuning, but an overly long window slows deployments, and a load balancer such as ELB may already have marked the backend unhealthy, at which point the extra time buys nothing.

Restart policies — deeper #

The table from Intermediate #4, now from an operations angle.

Policy           When
no               One-shot containers (migrations, seeds, builds)
always           Always, even on host boot
on-failure[:N]   Non-zero exit code, with a max retry count
unless-stopped   Always, unless explicitly stopped

Safe production default — unless-stopped #

The difference between always and unless-stopped confuses people. The distinction is what docker stop means:

  • With always, stopping with docker stop and restarting the daemon brings the container back.
  • With unless-stopped, a container stopped via docker stop stays stopped across daemon restarts.

It’s more reasonable to honor an operator’s explicit stop, so unless-stopped is the production default.

Restart loop — backoff #

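If an app dies on startup every time, restart: always turns into an endless crash loop. Docker does not stop the loop, but it throttles it with a backoff that doubles the delay before each restart. The delay resets once a container has stayed up for at least 10 seconds. To cap the number of attempts, use on-failure:N instead.

restart backoff
1st restart → wait 100ms
2nd restart → wait 200ms
3rd restart → wait 400ms
...
Nth restart → capped at 1 minute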
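The shape of the backoff is easy to tabulate. The sketch below assumes a delay that starts at 100 ms, doubles on every consecutive restart, and is capped at one minute; restart_delays and its parameters are hypothetical names, and the exact starting point may differ between Docker versions.

```python
def restart_delays(restarts: int, first_ms: int = 100, cap_ms: int = 60_000) -> list[int]:
    """Delay in milliseconds before each consecutive restart (assumed doubling schedule)."""
    delays = []
    delay = first_ms
    for _ in range(restarts):
        delays.append(min(delay, cap_ms))
        delay = min(delay * 2, cap_ms)  # double, but never beyond the cap
    return delays


print(restart_delays(5))       # [100, 200, 400, 800, 1600]
print(restart_delays(11)[-1])  # 60000
```

Under these assumptions the cap is reached on the 11th consecutive restart, so a crash-looping container settles at roughly one restart per minute.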

For containers that keep failing, read the recent logs with docker logs --tail 200 <c> and check whether the container was OOMKilled (#5).

Healthcheck — from operations #

The healthcheck from Intermediate #4, now from an operations angle.

Liveness vs. Readiness — two different questions #

Concepts from K8s, but equally useful as a thinking tool for Docker.

              Liveness                            Readiness
Question      Is it alive?                        Is it ready to receive traffic?
On failure    Restart the container               Block traffic (don’t restart)
Example       Stuck in deadlock → needs restart   DB temporarily disconnected → just pause traffic

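Docker has only one healthcheck and draws no such distinction. On its own, the Docker engine also just marks a failing container as unhealthy; it does not restart it (Swarm services do replace unhealthy tasks). So in Docker-only setups, mixing both concepts in one healthcheck is awkward.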

Workaround — define healthcheck closer to liveness, and handle readiness inside the app. For example, return 503 from /health for the first N seconds after startup, then 200 once ready.
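One way to handle readiness inside the app is a flag that the startup code flips once warm-up is done. A minimal sketch using only the standard library; ready, HealthHandler, and start_health_server are hypothetical names, and a real app would expose the same logic through its own web framework:

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

ready = threading.Event()  # set by the app once startup work has finished


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        # 503 while starting up, 200 once the app has declared itself ready.
        status = 200 if ready.is_set() else 503
        body = b"ok" if status == 200 else b"starting"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the application logs


def start_health_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    """Serve /health on a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The app calls ready.set() after migrations or cache warm-up complete. Driving the 503 from an explicit flag rather than a fixed number of seconds avoids guessing how long startup takes.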

Properties of a good healthcheck #

Good healthcheck
□ Responds quickly (within 1s)
□ Doesn't recursively check downstream services
□ No side effects
□ Dedicated endpoint (/health) — separated from regular traffic
□ No auth (don't add an attack surface; access from inside only)
Bad healthcheck
✗ Runs DB queries — DB load
✗ Runs business logic — dependencies / load
✗ Calls external APIs — when an external dep goes down, your container goes unhealthy
✗ Requires auth — health checkers also need credentials

A healthcheck only needs to confirm the app is alive and able to serve a request. The health of dependencies belongs to separate monitoring.

Startup grace — start_period #

For apps with migrations or warm-up, a brief unhealthy period right after startup is normal.

With start_period
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 60s   # failures during the first 60 seconds aren't counted

The closest K8s equivalent is startupProbe.
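The curl-based test only works if curl exists in the image, which is often not the case for slim or distroless bases. For a Python image, the probe can be written with the standard library instead. A minimal sketch, where check and the default timeout are hypothetical choices for this example:

```python
import urllib.error
import urllib.request


def check(url: str, timeout: float = 2.0) -> int:
    """Return 0 if the endpoint answers with a 2xx status, 1 otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses such as 503.
        return 1
```

Saved as a script (say, a hypothetical healthcheck.py) that ends with sys.exit(check("http://localhost:8000/health")), it can be referenced from the healthcheck test as ["CMD", "python", "healthcheck.py"]. Docker treats exit code 0 as healthy and 1 as unhealthy.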

Logging — operational details #

The stdout principle and log drivers from Intermediate #6, now from an operations angle.

Log rotation #

Required options
services:
  web:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

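Without these, unbounded log growth filling the disk is a classic Docker incident. With the values above, each container keeps at most about 30 MB of logs (3 files of 10 MB). Set the same limits daemon-wide for safety.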

/etc/docker/daemon.json
{
  "log-driver": "local",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

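The local driver stores logs in a compact internal format, with rotation and compression enabled by default, which makes it more efficient than json-file. It is becoming the production standard. Changes to daemon.json take effect after a daemon restart and apply only to newly created containers.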

To external collectors #

As production scales, logs don’t stop at stdout — they flow to external collectors.

To fluentd
services:
  web:
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224
        tag: web.{{.Name}}

From there, route to Loki, Elasticsearch, CloudWatch, or anywhere else. This post leaves the topic at that one paragraph.

Monitoring — a one-line extension #

cAdvisor + Prometheus + Grafana from #5 is the first monitoring setup for Docker-only operations. Common panels:

  • CPU / memory / network per container
  • Restart count (alarm on containers restarting often)
  • OOMKill events
  • Healthcheck failure rate
  • Disk IO

A first alarm rule: “the same container restarts 3+ times in 5 minutes.” Catch frequent OOMKills / crashes early.
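The rule is a sliding-window count per container. In practice it would live in the alerting system, but the logic can be sketched in a few lines. RestartAlarm and its parameters are hypothetical names, and timestamps are plain seconds:

```python
from collections import deque


class RestartAlarm:
    """Fire when one container restarts `threshold` times within `window_seconds`."""

    def __init__(self, threshold: int = 3, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: dict[str, deque] = {}  # container name -> restart timestamps

    def record(self, container: str, timestamp: float) -> bool:
        """Register one restart; return True if the alarm should fire."""
        events = self._events.setdefault(container, deque())
        events.append(timestamp)
        # Drop restarts that have fallen out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold


alarm = RestartAlarm()
print(alarm.record("web", 0))    # False
print(alarm.record("web", 100))  # False
print(alarm.record("web", 200))  # True: third restart within 5 minutes
```

Counting per container keeps one flapping service from hiding behind, or being blamed on, the restarts of another.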

Operations checklist #

The checklist for one container, gathering everything across the series:

Image / build
□ Multi-stage — separate build tools ([Intermediate #1])
□ Base: slim or distroless ([Intermediate #1], [#3])
□ Multi-arch — linux/amd64 + linux/arm64 ([#2])
□ Dockerfile: hadolint clean ([#3])
□ Image: Trivy HIGH/CRITICAL clean ([#3])
□ SBOM attached + cosign signed ([#4])
□ Build with buildx + external cache ([Intermediate #2], [#1])
Runtime / compose.yaml
□ image: digest or semver (no latest)
□ restart: unless-stopped
□ init: true (PID 1 handling)
□ stop_grace_period set (longer than the app's graceful time)
□ healthcheck — fast, light, no auth
□ Resources: mem_limit + cpus + pids_limit ([#5])
□ Security: read_only + tmpfs + cap_drop ALL + no-new-privileges ([#3])
□ Secrets: secrets: or external manager — never in ENV ([Intermediate #5], [#4])
□ Logs: max-size + max-file
□ DB / internal services: bind -p only to 127.0.0.1
□ Per-environment values: .env / override files ([Intermediate #4])
Deploy / CI
□ Build → multi-arch → SBOM → sign → push in one workflow ([#4])
□ Make verification a gate — Trivy / cosign verify
□ Tagging: semver + Git SHA + latest together ([Basics #5])
□ External cache: type=gha or type=registry ([Intermediate #2])

What’s next — Docker in Practice #

This series went deep on Docker itself. The next series — Docker in Practice — puts everything we’ve built into real app deploys:

  • FastAPI containerization — a production-grade Dockerfile
  • Django + PostgreSQL compose — admin / static / migration too
  • React/Next.js build container — standalone, multi-stage
  • Building images in CI — full GitHub Actions workflow
  • Registry push and tag strategy — operational details
  • Cloud deploy — one of Fly.io / Railway / ECS

A series where every tool we’ve built across Basics / Intermediate / Advanced finally comes together.

Wrap-up #

The picture from this post:

  • docker stop = SIGTERM → grace window → SIGKILL. PID 1’s signal handling is the core.
  • The PID 1 problem — apps weren’t designed for PID 1. Use init: true or dumb-init.
  • The app handles SIGTERM to finish in-flight work — Node’s server.close, gunicorn’s --graceful-timeout, Go’s srv.Shutdown.
  • stop_grace_period ensures enough time to clean up.
  • restart: unless-stopped is the production default; backoff throttles crash loops rather than stopping them.
  • Healthcheck: fast, light, closer to liveness. Dependency checks belong elsewhere.
  • Operations checklist split across image / runtime / deploy.