Docker in Practice #4: Building Images in CI — GitHub Actions and BuildKit Cache

Up to now, every build was local. Now we move the build into CI: code push → automatic build → registry push → deploy (covered in later posts).

This post is GitHub Actions–oriented, but the patterns transfer almost directly to GitLab CI / CircleCI. The core: BuildKit + cache + multi-arch.

What’s hard about Docker builds in CI — caches vanish #

Locally, the second build of the same image finishes nearly instantly because the Docker daemon keeps a layer cache on disk (Intermediate #2 build cache).

CI is different. Each workflow starts on a fresh VM. No cache, so it pulls the base image again, reinstalls deps from scratch, rebuilds from scratch. A Next.js project can routinely take 5–8 minutes.

Two tools fix this:

  1. docker/setup-buildx-action — install a BuildKit builder on the GHA runner.
  2. type=gha cache — use GitHub Actions’ cache as BuildKit’s cache backend so layer caches survive across workflows.

With both in place, every build after the first runs at close to local speed.

The simplest workflow #

.github/workflows/docker.yml — first version
name: Build and push image

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest

    permissions:
      contents: read
      packages: write   # required for GHCR push

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GHCR
        if: github.event_name == 'push'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: ${{ github.event_name == 'push' }}
          tags: ghcr.io/${{ github.repository }}:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

What this does:

  • Triggers on on.push.branches: [main] and pull_request. PRs only build; main pushes also push.
  • permissions.packages: write — the default token needs packages write to push to GHCR.
  • setup-buildx-action installs the BuildKit builder.
  • login-action logs into GHCR. Skipped on PRs since we don’t push there.
  • build-push-action’s cache-from/cache-to is the heart — save and load layers via the GHA cache.

mode=max saves every intermediate layer to the cache. The default, mode=min, exports only the layers that end up in the final image, so the earlier stages of a multi-stage build are rebuilt on every run. Use max.

This workflow updates only :latest per push. Not enough for production — next section is the tag strategy.

Auto-generating tags — docker/metadata-action #

With only :latest, you can’t trace which commit is running in production. The standard approach is to generate several tags per build automatically.

With metadata-action
- name: Extract metadata
  id: meta
  uses: docker/metadata-action@v5
  with:
    images: ghcr.io/${{ github.repository }}
    tags: |
      type=ref,event=branch        # branch name: main
      type=ref,event=pr            # PR: pr-123
      type=sha,prefix=sha-,format=short  # commit: sha-a1b2c3d
      type=semver,pattern={{version}}     # on tag push: 1.2.3
      type=semver,pattern={{major}}.{{minor}}  # 1.2
      type=raw,value=latest,enable={{is_default_branch}}

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    labels: ${{ steps.meta.outputs.labels }}
    cache-from: type=gha
    cache-to: type=gha,mode=max

What metadata-action does:

  • Auto-generates per the tags pattern. A main push gets main, sha-a1b2c3d, latest at once.
  • Also auto-generates labels — OCI standard labels (org.opencontainers.image.source, etc.) so the GHCR package page links to the repo.

The next post goes deeper into tag strategy — for now it’s enough that “metadata-action handles it.”
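
To see what metadata-action actually produced on a given run, a small debugging step can print its outputs. This is only a sketch, assuming the step above keeps id: meta; the step name is made up, while tags and version are outputs exposed by docker/metadata-action.

```yaml
# Hypothetical debugging step: place it right after the "Extract metadata" step.
- name: Show generated tags
  env:
    # Passing outputs through env keeps them out of the shell script text itself.
    TAGS: ${{ steps.meta.outputs.tags }}
    VERSION: ${{ steps.meta.outputs.version }}
  run: |
    echo "Primary version: $VERSION"
    echo "All tags:"
    echo "$TAGS"
```

Running this on a PR and on a main push is a quick way to confirm the tag rules behave as expected before anything is pushed.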

Multi-arch — amd64 + arm64 #

With Apple Silicon developers everywhere, multi-arch builds are essentially mandatory (Advanced #2).

Add multi-arch
- name: Set up QEMU
  uses: docker/setup-qemu-action@v3

- name: Set up Buildx
  uses: docker/setup-buildx-action@v3

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    platforms: linux/amd64,linux/arm64
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    cache-from: type=gha
    cache-to: type=gha,mode=max

New pieces:

  • setup-qemu-action is needed because the ubuntu-latest runner is amd64, so arm64 builds run under QEMU emulation. The action registers the binfmt handlers for you.
  • platforms: linux/amd64,linux/arm64 tells buildx to build both at once and push them under a single manifest list.

QEMU emulation is slow — 3–5× a native build. An amd64-only 1-minute build can become 4–5 minutes for amd64 + arm64. If your image only deploys to amd64 cloud, don’t bother with arm64.

If you really need fast multi-arch, you can run on ARM runners (runs-on: [self-hosted, arm64]) for native builds — but that brings infra cost.
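
One way to keep pull request feedback fast without giving up arm64 images is to emulate only when the result is actually pushed. The sketch below assumes PRs only need to prove that the image builds on amd64; the conditional is an ordinary GitHub Actions expression, and everything else mirrors the step above.

```yaml
- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    # PRs: amd64 only (native, fast). Pushes: amd64 + arm64 (arm64 via QEMU).
    platforms: ${{ github.event_name == 'pull_request' && 'linux/amd64' || 'linux/amd64,linux/arm64' }}
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    cache-from: type=gha
    cache-to: type=gha,mode=max
```

The trade-off is that a breakage specific to arm64 surfaces on main rather than in the PR.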

Build-time secrets — --secret #

Non-secret values like NEXT_PUBLIC_API_URL are fine via --build-arg (#3). For real build-time secrets (private npm registry tokens, GitHub PATs), use --secret instead of --build-arg. --build-arg values end up in image history in plaintext.

Dockerfile — using a secret
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
# Mounted only at build, never in the image
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    corepack enable && pnpm install --frozen-lockfile
Workflow — inject secret
- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    # NPMRC_TOKEN must hold the full .npmrc contents: the Dockerfile mounts it as /root/.npmrc
    secrets: |
      npmrc=${{ secrets.NPMRC_TOKEN }}
    cache-from: type=gha
    cache-to: type=gha,mode=max

Runtime secrets (DATABASE_URL, etc.) don’t belong at build time — those go in your cloud’s env / secret manager (#6).
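
If the repository secret holds only a token rather than a complete .npmrc, the file can be assembled on the runner and handed to the build through the action’s secret-files input. A minimal sketch, assuming a hypothetical NPM_TOKEN secret and the public npm registry host; a private registry would use its own host on that line.

```yaml
- name: Write a temporary .npmrc
  env:
    NPM_TOKEN: ${{ secrets.NPM_TOKEN }}   # hypothetical secret name
  run: |
    # Assumption: public npm registry host; swap in your private registry host.
    printf '//registry.npmjs.org/:_authToken=%s\n' "$NPM_TOKEN" > "$RUNNER_TEMP/npmrc"

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    # The id (npmrc) matches --mount=type=secret,id=npmrc in the Dockerfile.
    secret-files: |
      npmrc=${{ runner.temp }}/npmrc
    cache-from: type=gha
    cache-to: type=gha,mode=max
```

The file lives in the runner’s temp directory, which is discarded with the VM, and it is only mounted during the RUN step that asks for it.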

Reducing build time — common mistakes #

Slow builds drag out PR cycles, and a slow pipeline eventually gets bypassed. Common causes:

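Cache doesn’t persist: cache-to: type=gha,mode=max is missing, or each run writes under a different cache scope. GitHub isolates cache entries per branch (a run can read entries from its own branch and from the default branch), while the type=gha scope defaults to a single shared value, buildkit, unless you set scope= yourself.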

COPY . . too early — any code change cache-misses everything below. Copy dependency manifests (package.json, pyproject.toml, go.mod) first, install, then COPY . . (the pattern from #1 and #2).

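Empty .dockerignore: sending node_modules, .git, coverage, and .next wholesale as build context can add minutes. The signal is a large “transferring context” size under “[internal] load build context” in the BuildKit log (the legacy builder prints “Sending build context to Docker daemon” instead).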

Multiple services built serially in one workflow — parallelize with matrix.

Parallel image builds via matrix
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # required for GHCR push
    strategy:
      matrix:
        service: [api, web]
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}-${{ matrix.service }}
          tags: |
            type=ref,event=branch
            type=sha,prefix=sha-,format=short
      - uses: docker/build-push-action@v6
        with:
          context: ./${{ matrix.service }}
          push: ${{ github.event_name == 'push' }}
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha,scope=${{ matrix.service }}
          cache-to: type=gha,mode=max,scope=${{ matrix.service }}

Distinct scope values are essential — without them, two services collide on the same cache space.

Attest and SBOM — one more rung of supply chain security #

Another layer to stack at CI build time (Advanced #4 SBOM and signing).

attestations + SBOM
- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ steps.meta.outputs.tags }}
    sbom: true                # generate SBOM
    provenance: mode=max      # build provenance
    cache-from: type=gha
    cache-to: type=gha,mode=max

provenance attaches an attestation recording which workflow built the image and from which commit. It becomes useful later, when you automate supply chain verification. sbom: true attaches a software bill of materials the same way. There is almost no downside to leaving both on.
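
To check that the attestations really landed in the registry, a follow-up step can read them back with docker buildx imagetools inspect. This is a sketch, assuming the metadata step with id: meta from earlier and a prior GHCR login; the step name is illustrative.

```yaml
- name: Inspect provenance and SBOM
  if: github.event_name == 'push'
  env:
    # First tag generated by metadata-action, e.g. ghcr.io/<owner>/<repo>:main
    IMAGE: ${{ fromJSON(steps.meta.outputs.json).tags[0] }}
  run: |
    # The Go templates below are for buildx, not GitHub Actions (no leading $).
    docker buildx imagetools inspect "$IMAGE" --format '{{ json .Provenance }}'
    docker buildx imagetools inspect "$IMAGE" --format '{{ json .SBOM }}'
```

For a multi-platform image the output is keyed by platform, so expect one entry each for linux/amd64 and linux/arm64.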

Full workflow — in one place #

The pieces, all in one file. Copy-paste worthy as a starting point.

.github/workflows/docker.yml — final
name: Build and push image

on:
  push:
    branches: [main]
    tags: ['v*.*.*']
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest

    permissions:
      contents: read
      packages: write
      id-token: write   # needed only for OIDC-based signing (e.g. keyless cosign); BuildKit provenance works without it

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GHCR
        if: github.event_name == 'push'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix=sha-,format=short
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: ${{ github.event_name == 'push' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          sbom: true
          provenance: mode=max

Other cache backends — when type=gha isn’t enough #

type=gha is the easiest option within GHA, but it has two limits:

  • Cache size: the GHA cache is capped at 10 GB per repository by default. Past that, the least recently used entries are evicted, so large images miss often.
  • No use outside the workflow: local builds can’t pull the type=gha cache.

Alternatives:

  • type=registry — push the cache itself as a separate image tag. Shareable everywhere.

    cache-to: type=registry,ref=ghcr.io/.../cache,mode=max
    cache-from: type=registry,ref=ghcr.io/.../cache

    Adds registry cost but effectively no size limit.

  • type=s3 / type=gcs — cloud storage as cache. Good for standardizing across a large org.

For small projects type=gha is enough. Move when you actually see frequent cache misses.
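
As a rough illustration of the type=registry variant, the step below reads the cache on every run but writes it only on pushes, since PR runs are not logged in to the registry. The image path ghcr.io/my-org/my-app and the buildcache tag are hypothetical placeholders, not values from this series.

```yaml
- name: Build and push (registry cache)
  uses: docker/build-push-action@v6
  with:
    context: .
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    # Hypothetical cache ref: a dedicated tag kept apart from the release tags.
    cache-from: type=registry,ref=ghcr.io/my-org/my-app:buildcache
    # Export the cache only when we are authenticated, i.e. on pushes.
    cache-to: ${{ github.event_name == 'push' && 'type=registry,ref=ghcr.io/my-org/my-app:buildcache,mode=max' || '' }}
```

Because the cache is an ordinary registry artifact, a local docker buildx build can point --cache-from at the same ref.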

Common pitfalls #

“buildx not found”: setup-buildx-action is missing. It has to run once in the job, before the first Docker build step.

permission denied on GHCR push — missing permissions.packages: write. Or the organization’s GHCR settings block write (Settings → Actions → General → Workflow permissions).

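Cache exists but rebuilds from scratch: something that changes on every run sits above the expensive steps, for example a build arg carrying a timestamp or a file in the build context that is regenerated each time. Move such inputs as late in the Dockerfile as possible. When a layer like RUN apt-get update && apt-get install -y ... does have to rerun, a BuildKit cache mount keeps the downloaded packages between builds on the same builder. Note that cache mounts are not exported by type=gha, so on a fresh GitHub-hosted runner they start empty.

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y curl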

arm64 builds take over 5 minutes — QEMU’s ceiling. If you build very often, consider ARM runners.

Secrets unavailable on PRs — forked PRs are blocked from secrets by security design. You can use pull_request_target for trusted contributors only, but it comes with serious security gotchas — not recommended.
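
A safer alternative to pull_request_target is to skip the secret-dependent step when the PR comes from a fork. The sketch below assumes the NPMRC_TOKEN secret from the build-time secrets section; the condition compares the PR’s head repository with the current one.

```yaml
- name: Build (needs private registry access)
  # Runs on pushes and on PRs from branches of this repository; skipped for forks.
  if: github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository
  uses: docker/build-push-action@v6
  with:
    context: .
    push: ${{ github.event_name == 'push' }}
    tags: ${{ steps.meta.outputs.tags }}
    secrets: |
      npmrc=${{ secrets.NPMRC_TOKEN }}
    cache-from: type=gha
    cache-to: type=gha,mode=max
```

With this guard a forked PR passes without building the image, so a maintainer still needs to run the build on a trusted branch before merging.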

Wrap-up #

  • The core of CI Docker builds is BuildKit + GHA cache (type=gha,mode=max). Builds from the second run feel local-fast.
  • Tags via docker/metadata-action — branch/PR/SHA/semver/latest in one stroke.
  • Multi-arch (amd64 + arm64) is setup-qemu + platforms. QEMU is slow — skip arm64 if not needed.
  • Build-time secrets via --secret, not --build-arg. Doesn’t end up in the image.
  • For multiple services in one repo, parallelize with matrix + distinct cache scope.
  • sbom: true + provenance: mode=max is usually a net win — leave them on.

In the next post (#5 Registry push and tag strategy) we go deep on tag strategy: meanings and traps of semver / sha / latest, GHCR vs Docker Hub vs ECR, image retention policies.
