AWS Advanced #2: ECR — Image Registry

Sunday, April 26, 2026

#1 ECS and Fargate covered container operations. One piece is missing: where do the images that ECS / Fargate pull actually live? External registries like Docker Hub work, but inside AWS the standard is Amazon ECR (Elastic Container Registry).

This post covers ECR’s place — private vs public, IAM auth, push / pull, security (scanning, tag immutability), and ops (lifecycle policies, multi-arch) — all in one go.

What an image registry is for #

Recall the Docker flow:

image lifecycle

docker build → local image
       ↓
docker push → registry (remote)
       ↓
docker pull (somewhere else) → bring it down, docker run

The registry sits in the middle. It’s what decides who can pull which image, from where, at which version.

The options #

Registry	Notes
Docker Hub	Most famous. Free, but with pull limits, public-by-default
GHCR (GitHub Container Registry)	Linked to GitHub accounts. Generous private free tier
Amazon ECR Private	IAM auth inside AWS. Plays naturally with ECS / Lambda / EKS
Amazon ECR Public	For OSS distribution. Anyone can pull anonymously
Google Artifact Registry (successor to GCR) / Azure ACR	Other clouds

If your ECS / Lambda / EKS runs on AWS, ECR is the standard:

IAM auth — no separate password to manage
VPC Endpoints to skip the internet on pull (NAT cost savings)
Same region → fast pulls
Image scanning (vulnerability analysis) integrated

Private vs Public #

Two flavors.

Private (most cases) #

Only the IAM users / roles you authorize can access it (by default, those in your own account). Almost all production work uses Private.

Region-scoped (each image is pinned to a region)
IAM policies for access control
Cost: GB stored + Data Transfer

Public (OSS distribution / learning) #

Anyone in the world can pull anonymously. Listed on the AWS-run Public Gallery.

Always hosted in us-east-1 (global)
Push requires IAM auth, pull is anonymous
Cost: GB stored + Data Transfer (push side)

This post assumes Private.

Creating a Repository #

The unit in ECR is a Repository. One repo holds many tags (versions) of the same app.

create a repo

aws ecr create-repository \
  --repository-name myapp \
  --region ap-northeast-2 \
  --image-scanning-configuration scanOnPush=true \
  --encryption-configuration encryptionType=AES256

On success the URI is:

123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp

That’s the address for every push / pull. Format:

ECR URI shape

<account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>

Options #

Option	Meaning
`image-scanning-configuration scanOnPush=true`	Auto vulnerability scan on push
`image-tag-mutability IMMUTABLE`	Forbid overwriting the same tag — recommended for prod
`encryption-configuration encryptionType=KMS`	Encrypt with a customer-managed KMS key

You can do the same in the console GUI.
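
Because every push / pull uses the URI shape shown earlier, it can help to assemble it from its parts instead of hand-typing it. A minimal sketch; the function names ecr_image_uri and ecr_registry_host are made up for this example:

```shell
# Build an image reference from its four parts.
ecr_image_uri() {
  # $1 = account id, $2 = region, $3 = repository, $4 = tag
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Print only the registry host, which is the part docker login needs.
ecr_registry_host() {
  # $1 = full image reference; drop everything from the first "/" onward
  printf '%s\n' "${1%%/*}"
}

ecr_image_uri 123456789012 ap-northeast-2 myapp v1
ecr_registry_host 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:v1
```

Keeping the account ID and region in variables means a typo shows up in one place rather than in every command.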

Auth — `aws ecr get-login-password` #

Unlike Docker Hub, ECR auth is via AWS IAM. There’s no separate password — instead you get a temporary token and docker login with it.

ECR login (12-hour validity)

aws ecr get-login-password --region ap-northeast-2 \
  | docker login --username AWS --password-stdin \
    123456789012.dkr.ecr.ap-northeast-2.amazonaws.com

The token is valid for 12 hours. In CI, re-acquire it at the start of every job.
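
On a long-lived build host (as opposed to a fresh CI job) you may want to re-login only when the token is close to expiring. A rough sketch of that check; ecr_token_needs_refresh and the 5-minute safety margin are assumptions of this example, not anything ECR provides:

```shell
# 12 hours expressed in seconds.
ECR_TOKEN_TTL=43200

# Succeeds (exit 0) when a token issued at $1 should be refreshed at time $2.
# Both arguments are Unix timestamps in seconds. The 300-second margin is an
# arbitrary choice for this sketch.
ecr_token_needs_refresh() {
  [ $(( $2 - $1 )) -ge $(( ECR_TOKEN_TTL - 300 )) ]
}

# Example: a token issued 11h58m ago falls inside the margin, so refresh it.
if ecr_token_needs_refresh 0 43080; then
  echo "refresh"
fi
```

In practice you would record `date +%s` next to the login step and pass it in as the first argument.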

Permissions you need #

policy on the user / role doing the push

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage",
        "ecr:BatchGetImage"
      ],
      "Resource": "arn:aws:ecr:ap-northeast-2:123456789012:repository/myapp"
    }
  ]
}

GetAuthorizationToken requires * (the only one that does); the rest you scope to the specific repo (Basics #6 least privilege).

Pull-only permissions #

For ECS Tasks (Execution Role) you only need pull. The AWS-managed policy AmazonECSTaskExecutionRolePolicy includes ECR pull permissions.

Push / Pull #

Push #

first push

# build
docker build -t myapp:v1 .

# tag (with the ECR URI)
docker tag myapp:v1 \
  123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:v1

# login (see above)
aws ecr get-login-password --region ap-northeast-2 | docker login ...

# push
docker push \
  123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:v1

If the image is 100 MB the first push uploads 100 MB. Subsequent pushes use layer-level caching — only the changed parts go up, usually a few MB.

Pull #

pull

docker pull 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:v1

ECS / Lambda do this automatically. You rarely need to pull by hand, but it’s useful for debugging.

Tagging strategies #

How you name versions of the same image inside the repo. Common patterns.

1) Semver #

myapp:1.4.2
myapp:1.4
myapp:1
myapp:latest

Natural for libraries / tools you publish externally. In production, latest is dangerous — it’s unclear which build “latest” actually refers to.

2) Git SHA #

myapp:abc1234        ← short sha
myapp:abc1234567...  ← full sha

1:1 with the commit your CI built. Most recommended for prod — you can immediately trace which commit is in production.
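
A small sketch of deriving the short tag from a full commit SHA. The SHA below is invented for illustration, and short_sha_tag is a hypothetical helper; in a real CI job the value would usually come from the CI system or from `git rev-parse HEAD`:

```shell
# Turn a full commit SHA into a 7-character tag.
short_sha_tag() {
  printf '%s\n' "$1" | cut -c1-7
}

# Made-up full SHA, for illustration only.
FULL_SHA=abc1234def5678901234567890abcdef12345678
echo "myapp:$(short_sha_tag "$FULL_SHA")"
```

Seven characters is a convention, not a rule; use more if your repository is large enough for short SHAs to collide.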

3) Environment + sequence #

myapp:prod-2025-04-01.001
myapp:staging-2025-04-01.005

Fits teams that number releases per day: the suffix is that day’s sequence number.

4) Multi-tag #

The recommended ops pattern: immutable + alias.

build once, two tags

docker build -t myapp:abc1234 .                  # immutable (forever the same)
docker tag myapp:abc1234 myapp:prod-current      # mutable (points at current prod)

Make the ECR repo itself IMMUTABLE (can’t overwrite a pushed tag). An alias such as prod-current can’t be moved inside an immutable repo, so let a separate tool (your deployment system) track which immutable tag each alias points at.
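
One way a deployment system might keep such aliases outside the registry is a plain mapping from alias to immutable tag. The sketch below is purely illustrative: the file format, the ALIAS_FILE variable, and the set_alias / resolve_alias helpers are all assumptions, standing in for whatever your deploy tooling actually uses.

```shell
# Hypothetical alias store: one "<alias> <tag>" pair per line in a plain file.
ALIAS_FILE="${ALIAS_FILE:-./image-aliases.txt}"

# Point an alias at an immutable tag, replacing any previous entry.
set_alias() {
  # $1 = alias (e.g. prod-current), $2 = immutable tag (e.g. abc1234)
  touch "$ALIAS_FILE"
  # grep exits non-zero when nothing is left to print; that is fine here.
  grep -v "^$1 " "$ALIAS_FILE" > "$ALIAS_FILE.tmp" || true
  printf '%s %s\n' "$1" "$2" >> "$ALIAS_FILE.tmp"
  mv "$ALIAS_FILE.tmp" "$ALIAS_FILE"
}

# Print the immutable tag an alias currently points at.
resolve_alias() {
  awk -v a="$1" '$1 == a { print $2 }' "$ALIAS_FILE"
}

# Usage (writes to $ALIAS_FILE):
#   set_alias prod-current abc1234
#   resolve_alias prod-current
```

The registry only ever sees the immutable SHA tag; moving prod-current is a change to this mapping, not a re-push.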

Image scanning #

ECR auto-scans pushed images for vulnerabilities. Enable with scanOnPush=true (set above).

Two flavors #

Type	What	Cost
Basic Scanning	One-shot scan (on push or on demand) of OS packages against a CVE database (historically the open-source Clair project)	Free
Enhanced Scanning	Inspector integration. OS layer + language libraries (npm, pip, etc.). Continuous monitoring (alerts when new CVEs land after push)	per image scan and rescan (billed by Amazon Inspector)

Look at Enhanced for production workloads. Basic is enough to start.

Reading results #

scan findings

aws ecr describe-image-scan-findings \
  --repository-name myapp \
  --image-id imageTag=v1

In the console: repo → image → “Vulnerabilities” tab shows CRITICAL / HIGH / MEDIUM / LOW counts at a glance.

Block at the build stage #

Block deploys when CRITICAL is non-zero — in your CI job:

CI gate

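# Scan results arrive asynchronously; wait for the scan to finish first.
aws ecr wait image-scan-complete \
  --repository-name myapp --image-id imageTag="$SHA"

CRITICAL=$(aws ecr describe-image-scan-findings \
  --repository-name myapp --image-id imageTag="$SHA" \
  --query 'imageScanFindings.findingSeverityCounts.CRITICAL' \
  --output text)

# The query prints "None" when there are no CRITICAL findings.
if [ "$CRITICAL" != "None" ] && [ "$CRITICAL" -gt 0 ]; then
  echo "🚨 CRITICAL CVE found. Blocking deploy."
  exit 1
fi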
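
The numeric comparison is easy to get wrong when the query prints something unexpected (an empty string, an error message). A minimal sketch that isolates the decision in one function and fails closed on anything it doesn't recognize; deploy_allowed is a hypothetical name, and treating empty output as a block is a choice made for this example:

```shell
# Decide whether a CRITICAL count should block the deploy.
# $1 is the text the CLI query printed: a number, "None", or something else.
# Returns 0 when the deploy may proceed, 1 when it must be blocked.
deploy_allowed() {
  case "$1" in
    None)         return 0 ;;        # no CRITICAL key in the findings
    ''|*[!0-9]*)  return 1 ;;        # empty or non-numeric: fail closed
    *)            [ "$1" -eq 0 ] ;;  # numeric: allow only zero
  esac
}

for count in None 0 3 scan-error; do
  if deploy_allowed "$count"; then
    echo "$count -> deploy"
  else
    echo "$count -> block"
  fi
done
```

In the gate itself this would read `deploy_allowed "$CRITICAL" || exit 1`.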

Lifecycle policies — auto-cleanup #

Images pile up; storage costs follow. Use a lifecycle policy to auto-clean.

lifecycle.json

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Delete untagged images after 7 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Keep only the latest 30 (delete the rest)",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": 30
      },
      "action": { "type": "expire" }
    }
  ]
}

apply

aws ecr put-lifecycle-policy \
  --repository-name myapp \
  --lifecycle-policy-text file://lifecycle.json
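
Before applying a count-based rule it helps to see which images it would remove. ECR has its own preview (`aws ecr start-lifecycle-policy-preview`); the sketch below is only a local approximation of the "keep the newest N" logic, working on a made-up input format of push time and tag per line. expired_by_count is a hypothetical helper.

```shell
# Print the tags that a "keep the newest N" rule would expire.
# stdin: one "<push-epoch-seconds> <tag>" pair per line (format assumed here).
# $1:    N, the number of newest images to keep.
expired_by_count() {
  # Newest first, skip the first N lines, print the tag column of the rest.
  sort -rn | tail -n +"$(( $1 + 1 ))" | awk '{ print $2 }'
}

# Five images, keep the newest three: the two oldest are printed.
printf '%s\n' \
  '1700000300 d4' \
  '1700000100 b2' \
  '1700000400 e5' \
  '1700000200 c3' \
  '1700000000 a1' | expired_by_count 3
```

This ignores rule priorities and tag filters, so treat it as a way to build intuition rather than a substitute for the real preview.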

Common patterns:

Delete untagged after 7–14 days (leftovers from failed builds)
Delete pr- prefixed tags after 30 days (PR preview images)
Keep release- prefixed forever
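
The prefix-based patterns use tagStatus "tagged" with a tagPrefixList. A sketch that prints one such rule so it can be pasted into the rules array; prefix_expiry_rule is a hypothetical helper, and the priority, prefix, and day count are example values:

```shell
# Print one lifecycle rule that expires images whose tags start with a prefix.
# $1 = rule priority, $2 = tag prefix, $3 = age in days
prefix_expiry_rule() {
  cat <<EOF
{
  "rulePriority": $1,
  "description": "Delete $2* tags after $3 days",
  "selection": {
    "tagStatus": "tagged",
    "tagPrefixList": ["$2"],
    "countType": "sinceImagePushed",
    "countUnit": "days",
    "countNumber": $3
  },
  "action": { "type": "expire" }
}
EOF
}

prefix_expiry_rule 2 pr- 30
```

If you add this to a policy that also has a tagStatus "any" rule, give the "any" rule the largest rulePriority number so it is evaluated last.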

After a year in production, an unmanaged repo accumulates GBs of stored images and the cost that goes with them. Setting a lifecycle policy when you create the repo is part of operational hygiene.

Multi-architecture images #

A linux/arm64 image built on Apple Silicon won’t boot on amd64 production. Two paths.

1) buildx multi-platform push #

buildx multi-arch

docker buildx create --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:v1 \
  --push .

A single ECR tag (v1) holds a manifest list with both architectures. Pull side automatically picks the matching arch.
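
The "matching arch" is decided by the machine type of the host doing the pull. A small sketch of that mapping, useful when a build script needs to know which platform the local machine would produce by default; docker_platform_for is a hypothetical helper and only covers the two architectures discussed here:

```shell
# Map `uname -m` style machine names to Docker platform strings.
docker_platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unknown machine type: $1" >&2; return 1 ;;
  esac
}

docker_platform_for arm64    # what an Apple Silicon Mac reports
docker_platform_for x86_64   # a typical amd64 server
```

Calling it as `docker_platform_for "$(uname -m)"` tells you what a plain `docker build` on this machine would target.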

2) Standardize on Fargate ARM #

In the Fargate Task Definition from #1, set runtimePlatform.cpuArchitecture to ARM64 and you only need ARM images. Bonus: about 20% cheaper.

Task Definition (ARM)

{
  "runtimePlatform": {
    "cpuArchitecture": "ARM64",
    "operatingSystemFamily": "LINUX"
  }
}

For a new small- to mid-traffic project, going ARM from day one is the right call.

VPC Endpoint — pull without NAT #

ECS Tasks in private subnets pulling from ECR → by default through NAT Gateway → per-GB charge.

ECR supports VPC Endpoints to skip NAT:

three ECR VPC Endpoints

# api calls
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --service-name com.amazonaws.ap-northeast-2.ecr.api \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-aaa subnet-bbb

# Docker registry API (docker push / pull)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --service-name com.amazonaws.ap-northeast-2.ecr.dkr \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-aaa subnet-bbb

# image layers live in S3 — add an S3 endpoint
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --service-name com.amazonaws.ap-northeast-2.s3 \
  --vpc-endpoint-type Gateway \
  --route-table-ids rtb-xxx

The three (api, dkr, s3) form a set. Cuts NAT Gateway costs significantly — almost mandatory at scale.
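
The service names follow one pattern per region, so a script that sets up several regions can generate them. A minimal sketch; ecr_endpoint_services is a hypothetical helper that only prints names and endpoint types, it does not create anything:

```shell
# Print the three endpoint service names for one region, with their types.
ecr_endpoint_services() {
  # $1 = region
  printf 'com.amazonaws.%s.ecr.api Interface\n' "$1"
  printf 'com.amazonaws.%s.ecr.dkr Interface\n' "$1"
  printf 'com.amazonaws.%s.s3 Gateway\n' "$1"
}

ecr_endpoint_services ap-northeast-2
```

Each output line can then be fed to `aws ec2 create-vpc-endpoint` with the subnet or route table arguments that match its type.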

Cross-account access #

Want to pull from the prod account’s ECR repo into a dev account? Use a Repository Policy on the prod repo to allow it. The dev-side user / role still needs ecr:GetAuthorizationToken (with Resource *) in its own IAM policy; that action is registry-wide and is not granted per repository.

repo-policy.json

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowDevAccountPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::222222222222:root"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage"
      ]
    }
  ]
}

aws ecr set-repository-policy \
  --repository-name myapp \
  --policy-text file://repo-policy.json

Principle: production images live once in the prod ECR; other accounts pull. Don’t rebuild the same image into per-environment ECRs.

Cost #

Item	Price (Seoul region)
Storage	$0.10 per GB-month
Data Transfer Out (internet)	$0.126 per GB (first 100 GB / month free, shared across AWS services)
Data Transfer Out (same region)	Free
Enhanced Scanning	per image scan and rescan (billed by Amazon Inspector)

ECS pulls in the same region → free. Only internet egress (CI / external tools) pulls cost. With a lifecycle policy your storage stays small and the bill is essentially nothing.
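
To put numbers on that, here is a back-of-the-envelope estimate using the two per-GB prices from the table. It is a sketch only: ecr_monthly_cost is a hypothetical helper, it ignores any free allowance and scanning charges, and prices change, so check the current pricing page before relying on it.

```shell
# Rough monthly estimate in USD from the list prices in the table above.
# $1 = GB stored, $2 = GB pulled over the internet.
# Assumes no free allowance and ignores scanning charges.
ecr_monthly_cost() {
  awk -v stored="$1" -v egress="$2" \
    'BEGIN { printf "%.2f\n", stored * 0.10 + egress * 0.126 }'
}

# 20 GB stored, 50 GB pulled by external CI: prints 8.30
ecr_monthly_cost 20 50
```

Same-region pulls from ECS don't appear in the formula at all, which is the point of the paragraph above.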

Common pitfalls #

1) `denied: User: ... is not authorized to perform: ecr:...` #

Permission missing. Check both:

ECR actions are in the user / role policy
The Repository Policy isn’t blocking (usually empty — empty means IAM policy is enough)

2) `manifest unknown` or `repository ... not found` #

99% of the time it’s the wrong region or account ID in the URI. Double-check the ap-northeast-2 part and the 123456789012 part.
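
That double-check can be scripted. A minimal sketch that pulls the account ID and region out of a URI and compares them with what you expect; check_ecr_uri is a hypothetical helper and assumes the standard private-registry hostname shape:

```shell
# Compare the account ID and region inside an ECR URI with expected values.
# $1 = image URI, $2 = expected account ID, $3 = expected region
check_ecr_uri() {
  host="${1%%/*}"                 # <account>.dkr.ecr.<region>.amazonaws.com
  account="${host%%.*}"           # text before the first dot
  region="${host#*.dkr.ecr.}"     # drop "<account>.dkr.ecr."
  region="${region%%.*}"          # keep text up to the next dot
  status=0
  [ "$account" = "$2" ] || { echo "account mismatch: $account (expected $2)"; status=1; }
  [ "$region" = "$3" ]  || { echo "region mismatch: $region (expected $3)"; status=1; }
  return "$status"
}

# A URI that points at the wrong region reports the mismatch.
check_ecr_uri 123456789012.dkr.ecr.ap-northeast-1.amazonaws.com/myapp:v1 \
  123456789012 ap-northeast-2 || true
```

The expected values would typically come from `aws sts get-caller-identity` and your configured region.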

3) Pushing the same tag against an `IMMUTABLE` repo #

Pushing the same tag twice gets denied. This is intentional — recommended for prod. Work around it by tagging with the commit SHA in your CI job.

4) Forgot multi-architecture #

Build on Mac (ARM) → push → exec format error on x86_64 Fargate. Use buildx multi-arch, or standardize the Task Definition on ARM.

5) NAT Gateway cost blow-up #

ECR pulls through NAT add up by GB. Add the three VPC Endpoints (api / dkr / s3).

6) Image accumulation #

Without a lifecycle policy, a year of production gives you thousands of images and many GB of storage to pay for. Add a lifecycle policy when you create the repo.

Wrap-up #

Here is what this post covered:

Where ECR fits — AWS’s image registry. Plays naturally with ECS / Lambda / EKS via IAM
Private vs Public — production is Private. Public is for OSS distribution
Repository = one app’s image collection. URI shape <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>
Auth — aws ecr get-login-password → docker login. 12-hour token
Permissions — split push (full) from pull (AmazonECSTaskExecutionRolePolicy)
Tagging strategies — Semver / Git SHA / env+sequence. Production goes Git SHA + IMMUTABLE
Image scanning — Basic (free) / Enhanced (Inspector, continuous). CI gate on CRITICAL
Lifecycle policy — rules like “untagged for 7 days, keep latest N” auto-clean
Multi-architecture — buildx for amd64 + arm64. Or standardize on Fargate ARM (20% cheaper)
VPC Endpoint (api / dkr / s3) — skip NAT Gateway. Almost mandatory in production
Cross-account — Repository Policy for prod ↔ dev pull
Pitfalls — permission, URI typo, IMMUTABLE conflict, missing multi-arch, NAT cost, image sprawl

Up next — Lambda #

ECS with ECR is the model where a container is always running. Next we look at the opposite: the function wakes only when called, the serverless side.

In #3 Lambda Basics we cover where Lambda fits (vs ECS / EC2), runtime / handler / event model, cold start, concurrency, and logging — AWS’s first serverless piece, all in one go.