Docker Basics #6: .dockerignore and the Build Context — Using the Cache Well
The final post of the Docker Basics series. We’ve containerized a small app, run it, kept its data, and pushed it to a registry. This post drills into one problem — why images get fat and why builds get slow — and the two tools that fix both: .dockerignore and the layer cache.
Posts in the Docker Basics series:
- #1 What is a container
- #2 Writing your first Dockerfile
- #3 Images and containers
- #4 Volumes and networks
- #5 Registries — Docker Hub, GHCR, push/pull
- #6 .dockerignore and the build context ← this post
The build context — what Docker does before the build starts #
When you run docker build -t myapp ., the trailing dot is the build context. Before the build starts, Docker bundles up the entire directory and ships it to the daemon.
host                              docker daemon
┌──────────────────┐              ┌─────────────────┐
│ ./hello-docker   │              │                 │
│ ├ app.py         │ tar archive  │                 │
│ ├ requirements   │ ───────────▶ │      build      │
│ ├ Dockerfile     │              │                 │
│ ├ ...            │              │                 │
│ └ .git/          │              └─────────────────┘
└──────────────────┘

That’s the line you see in build output:
=> [internal] load build context 500ms
=> => transferring context: 152MB

The bigger that 152MB:
- The slower the build is to start.
- The more memory and disk the Docker daemon needs to hold it.
- The more work goes into change detection (cache key calculation).
- And most importantly, an instruction like COPY . . can bake those files into the image.
Two things to keep separate:
- What ships as context: every file the daemon receives.
- What ends up in the image: only what you explicitly bring in via COPY/ADD.
.dockerignore trims the first of those.
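To make the "ships as context" half concrete, here is a minimal Python sketch that tars a directory the way a build client conceptually does and reports how large the archive is. It is an illustration only, not Docker’s actual packing code; pack_context and the demo file names are made up for this example.

```python
import io
import os
import tarfile
import tempfile


def pack_context(root: str) -> tuple[int, list[str]]:
    """Tar every file under root in memory; return (archive bytes, sorted names)."""
    buffer = io.BytesIO()
    names = []
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                full = os.path.join(dirpath, filename)
                # Store paths relative to the context root, with forward slashes.
                rel = os.path.relpath(full, root).replace(os.sep, "/")
                archive.add(full, arcname=rel)
                names.append(rel)
    return len(buffer.getvalue()), sorted(names)


if __name__ == "__main__":
    # Hypothetical project: one small source file and one large .git blob.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, ".git"))
        with open(os.path.join(root, "app.py"), "w") as f:
            f.write("print('hello')\n")
        with open(os.path.join(root, ".git", "blob"), "wb") as f:
            f.write(b"x" * 1_000_000)
        size, names = pack_context(root)
        print(f"{size} bytes would be sent for {names}")
```

Nothing in the sketch looks at a Dockerfile: the archive is built from the directory alone, which is why an unused .git still costs transfer time.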
.git, node_modules, .venv: what to exclude #
The numbers are often surprising when you look at a small project’s directory:
du -sh .
# 152M .
du -sh .git node_modules .venv build dist *.log
# 80M .git
# 60M node_modules # will be reinstalled inside the container
# 8M .venv # same
# 4M build
# 1M dist
# 200K *.log

Almost none of that is needed inside the container. Dependencies are reinstalled by RUN pip install / RUN npm ci during the build, so there’s no reason to send your host’s node_modules or .venv. .git is usually excluded too (if the build needs git info, passing it as --build-arg GIT_SHA=... is cleaner).
Writing .dockerignore #
The syntax is almost the same as .gitignore. Create a .dockerignore next to your Dockerfile and list patterns to exclude from the build context.
# version control
.git
.gitignore
# env / secrets
.env
.env.*
*.key
*.pem
# build artifacts
dist/
build/
out/
*.egg-info/
# dependencies (reinstalled inside the container)
node_modules/
.venv/
__pycache__/
*.pyc
# editor / OS
.vscode/
.idea/
.DS_Store
Thumbs.db
# logs
*.log
logs/
# test / cache
.pytest_cache/
.mypy_cache/
.ruff_cache/
coverage/
.coverage
htmlcov/
# Docker tooling
Dockerfile.dev
docker-compose*.yml
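.dockerignore

The Dockerfile itself and .dockerignore are part of the context too. Nothing inside the image needs them, so it’s cleaner to add them to the ignore list. Docker still reads both for the build even when they are listed; they are only kept out of COPY/ADD.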
Pattern syntax in one table #
| Pattern | Meaning |
|---|---|
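| node_modules | A file or directory named node_modules at the context root (patterns are matched relative to the root) |
| node_modules/ | Same as node_modules; the trailing slash is normalized away |
| *.log | .log files at the context root only |
| **/*.log | .log files at any depth, root included |
| dist/** | Everything inside dist |
| !important.log | An explicit exception to the rules above: include this file |

This is the main difference from .gitignore: a pattern without a leading **/ does not recurse. In the sample file above, entries such as __pycache__/, *.pyc, and .DS_Store match at the context root only; write **/__pycache__, **/*.pyc, and **/.DS_Store to cover subdirectories as well.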
The ! exception is powerful but easy to misuse. Stick to plain ignores as much as you can.
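The rules are easier to reason about with a toy matcher in hand. The sketch below is a simplified approximation in Python, not Docker’s implementation: it assumes patterns are anchored at the context root, treats * as "anything except a slash" and ** as "any number of directories", lets a match on a directory cover everything beneath it, and lets the last matching line win. The function names is_ignored and _to_regex are invented for this example.

```python
import re


def _to_regex(pattern: str) -> re.Pattern:
    """Translate one ignore pattern into a regex anchored at the context root."""
    pattern = pattern.strip().strip("/")
    out, i = "", 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out += "(?:.*/)?"        # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out += ".*"              # anything, slashes included
            i += 2
        elif pattern[i] == "*":
            out += "[^/]*"           # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out += "[^/]"            # one character within a segment
            i += 1
        else:
            out += re.escape(pattern[i])
            i += 1
    # A match on a directory also covers everything beneath it.
    return re.compile("^" + out + "(?:/.*)?$")


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if path (relative, forward slashes) ends up excluded."""
    ignored = False
    for raw in patterns:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                 # skip blanks and comments
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        if _to_regex(line).match(path):
            ignored = not negated    # the last matching line wins
    return ignored


if __name__ == "__main__":
    rules = ["node_modules/", "*.log", "**/*.pyc", "!important.log"]
    for candidate in ["debug.log", "logs/app.log", "pkg/mod.pyc", "important.log"]:
        print(candidate, is_ignored(candidate, rules))
```

Because the last match wins, an exception only works when it appears after the rule it overrides, which is one reason ! lines are easy to get wrong.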
Verifying the effect #
Before / after, the build output shows the size drop directly:
# Before
=> [internal] load build context 1.2s
=> => transferring context: 152MB
# After adding .dockerignore
=> [internal] load build context 120ms
=> => transferring context: 240kB

The difference is tangible. In CI, where the context is transferred to a build machine on every run, the savings are even more obvious.
Layer cache — making builds fast #
The second pillar of Docker builds is the cache. For each line in your Dockerfile (i.e. each layer), Docker asks:
Is the input to this instruction (the previous layer + the instruction itself + any files it references) the same as last build? → If yes, reuse the cache; if no, rebuild from this layer down.
This model decides build speed. Once one layer misses the cache, every layer after it is rebuilt too, because each layer is keyed on the layer above it.
Where the cache breaks #
FROM python:3.14-slim
WORKDIR /app
# code and the dependency manifest land together
COPY . .
# any code change above busts this layer
RUN pip install -r requirements.txt

This Dockerfile reinstalls dependencies on every code change. Any edited file changes what COPY . . is keyed on, so that layer is rebuilt and the RUN after it loses its cache too. On a big project, that adds minutes to every build.
Right order — separate dependencies and code #
FROM python:3.14-slim
WORKDIR /app
# 1) Copy only the dependency manifest first
COPY requirements.txt .
# 2) Install — cache reused as long as requirements.txt is unchanged
RUN pip install --no-cache-dir -r requirements.txt
# 3) Then copy the code
COPY . .
CMD ["python", "app.py"]

Now if you only change app.py:
- COPY requirements.txt . → cache hit
- RUN pip install ... → cache hit (skip the install entirely)
- COPY . . → re-runs
- The whole build finishes in seconds.
Things that don’t change go up; things that change often go down. That’s the first heuristic for writing Dockerfiles. Order: base image (rarely changes) → system deps → language deps → code (changes often).
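The ordering rule can be checked with a toy model of the cache. The Python sketch below is not how BuildKit computes keys internally; it only assumes that a step’s key is a hash of the previous step’s key, the instruction text, and the contents of the files the step reads. The build helper, the step lists, and the file contents are all hypothetical.

```python
import hashlib


def digest(*parts: str) -> str:
    """Hash several strings into one hex key."""
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()


def build(steps, files, cache):
    """Replay steps against cache; return [(instruction, 'hit' or 'miss'), ...]."""
    parent = "scratch"
    report = []
    for instruction, inputs in steps:
        # Only the files this step reads contribute to its key.
        content = "".join(f"{name}={files[name]};" for name in sorted(inputs))
        key = digest(parent, instruction, content)
        report.append((instruction, "hit" if key in cache else "miss"))
        cache.add(key)
        parent = key  # the next step's key depends on this one
    return report


ALL_FILES = ["app.py", "requirements.txt"]
BAD_ORDER = [
    ("COPY . .", ALL_FILES),
    ("RUN pip install -r requirements.txt", []),
]
GOOD_ORDER = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ALL_FILES),
]


def rebuild_after_code_change(steps):
    """Build once, edit only app.py, build again; return the second run's statuses."""
    cache = set()
    files = {"requirements.txt": "flask==3.0", "app.py": "print('v1')"}
    build(steps, files, cache)       # first build fills the cache
    files["app.py"] = "print('v2')"  # change the code, not the manifest
    return [status for _, status in build(steps, files, cache)]


if __name__ == "__main__":
    print("bad order: ", rebuild_after_code_change(BAD_ORDER))
    print("good order:", rebuild_after_code_change(GOOD_ORDER))
```

In this model the install step never reads app.py directly; it misses in the bad order only because its parent key changed, which is the chaining effect described above.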
Same pattern for Node.js #
FROM node:20-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
CMD ["node", "server.js"]

Copy package.json / package-lock.json first → install with npm ci → then code. Same idea.
A side effect of the cache — image size #
Layer cache affects more than build speed; it shapes image size. Once a layer is baked, anything created during it is locked in.
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean

These three lines become three layers. The package index from apt-get update is baked into the first; deleting it later in another layer doesn’t actually shrink the image.
Combine into one line and you get one layer:
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

Chain the commands with && and clean up the package lists at the end, all in one layer. The image size differs noticeably:

| | Split | Combined |
|---|---|---|
| Image size | ~180MB | ~85MB |
You’ll see this pattern in nearly every official base image’s Dockerfile.
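Why a later delete doesn’t help can be shown with a small Python model. It is a deliberate simplification: each step mutates a fake filesystem, a layer is assumed to store only the files that are new or changed in that step, and a deletion costs nothing but also removes nothing from earlier layers. The paths and the sizes (arbitrary units) are invented for illustration.

```python
def build_layers(steps):
    """Run each step on a fake filesystem; return the bytes each layer stores."""
    fs = {}
    layer_sizes = []
    for step in steps:
        before = dict(fs)
        step(fs)
        # A layer keeps files that are new or changed in this step.
        # Deleted files add nothing here, and earlier layers still hold them.
        added = sum(size for path, size in fs.items() if before.get(path) != size)
        layer_sizes.append(added)
    return layer_sizes


def apt_update(fs):
    fs["/var/lib/apt/lists/index"] = 40  # package index, made-up size


def apt_install(fs):
    fs["/usr/bin/curl"] = 5              # installed binary, made-up size


def apt_cleanup(fs):
    fs.pop("/var/lib/apt/lists/index", None)


def combined(fs):
    """All three commands inside a single RUN, so only the end state is stored."""
    apt_update(fs)
    apt_install(fs)
    apt_cleanup(fs)


if __name__ == "__main__":
    split = build_layers([apt_update, apt_install, apt_cleanup])
    merged = build_layers([combined])
    print("split layers:", split, "total", sum(split))
    print("one layer:   ", merged, "total", sum(merged))
```

The index never reaches a stored layer in the combined case because it is gone before the step finishes.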
--no-cache: force a fresh build #
When the cache is hanging on to something stale:
docker build --no-cache -t myapp .

To pull base images fresh too:

docker build --pull -t myapp .
docker build --pull --no-cache -t myapp . # both

A cached apt-get update layer with a stale package index can leave you missing security patches. Periodic CI builds often force --no-cache or --pull on a schedule.
Wrapping the series — one container’s full cycle #
What you’ve built up over Basics, at a glance:
FROM python:3.14-slim
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1
WORKDIR /app
# System deps (one layer with cleanup)
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
# Language deps (rarely change — go up)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Code (changes often — goes down)
COPY app.py .
EXPOSE 8000
CMD ["python", "app.py"]

The matching .dockerignore:

.git
.env
.env.*
node_modules/
.venv/
__pycache__/
*.pyc
*.log
.pytest_cache/
.mypy_cache/
.ruff_cache/
.DS_Store
Dockerfile.dev
docker-compose*.yml

And the commands that carry it from build to another machine:

# 1) Build (the command CI runs most)
docker build -t ghcr.io/curtis/myapp:1.0.0 -t ghcr.io/curtis/myapp:latest .
# 2) Local sanity check
docker run --rm -p 8000:8000 ghcr.io/curtis/myapp:1.0.0
# 3) Production run — daemon mode + persistent data + restart policy
docker run -d --name myapp \
--restart unless-stopped \
--network mynet \
-p 127.0.0.1:8000:8000 \
-v myapp-data:/app/data \
-e DB_HOST=pg \
ghcr.io/curtis/myapp:1.0.0
# 4) Pull on another machine
docker pull ghcr.io/curtis/myapp:1.0.0

Once this flow is in your hands (define → build → run → ship), you’ve reached the destination of Docker Basics.
What’s next — Docker Intermediate #
Basics covered making and running a single container well. The next series steps further into multiple containers + deeper builds. What it covers:
- Multi-stage builds: separate build deps from runtime deps and slim the image
- Docker Compose: define web + db + cache in one file and start them together
- healthcheck, depends_on, profiles: Compose’s operational features
- Build cache deep dive: BuildKit and cache mounts, sharing pip install / npm ci caches across builds
- Logging and debugging: viewing logs from many containers in one place
- Environment variables and secrets: handling secrets without baking them into the image
It builds on the single-container habits you’ve now formed, and steps closer to a production environment. Everything from this series — layers, cache, volumes, networks — stacks underneath it.
Wrap-up #
The 6 Basics posts in one line each:
- #1: Container vs. VM, the bones of the Docker ecosystem
- #2: Your first Dockerfile with FROM / RUN / COPY / CMD
- #3: Day-to-day commands (build / run / ps / logs / exec / stop / rm)
- #4: bind mount / named volume, user-defined bridge networks
- #5: Docker Hub / GHCR, tag / push / pull, digests
- #6: .dockerignore, build context, the order that keeps the layer cache alive
The Docker track has four series. Next up is Docker Intermediate — Compose and multi-stage builds.