K8s Practice #4: CI/CD Pipeline — GitHub Actions / ECR / ArgoCD

The fourth post in the K8s Practice series. Through #3, myshop-api became a complete service equipped with EKS, RDS, Secrets, and a connection pool, but new releases still depend on manual steps. Someone builds the container and pushes it, someone changes the image tag in the manifest, and someone runs helm upgrade. This post turns that entire flow into code. GitHub Actions pushes images to ECR via OIDC trust without static keys, auto-commits Helm values in the manifest repo, and the ArgoCD covered in Advanced #6 watches the change and syncs it to the cluster.

This post is part of the six-post K8s Practice series.

Two-repo model — separation of code and manifests #

The most common GitOps pattern is the separation of two repos.

Repo                               Role
myshop-api (application repo)      Source code, Dockerfile, GitHub Actions workflow
myshop-manifests (manifest repo)   Helm values, Application manifests, per-environment config

This separation has three benefits:

  • Permission separation — code changes and infra/deployment changes can have different reviewers
  • Clarity of changes — looking at git log clearly shows “which version was up in prod at this point”
  • ArgoCD watches one place — watching only the manifest repo captures the desired state of every environment

The code push flow fits in a single diagram.

One cycle of GitOps
[developer push] → [GitHub Actions: build/test/ECR push]
                 → [auto-commit image tag in manifest repo]
                 → [ArgoCD detects change]
                 → [new version deployed to cluster]

Each step is covered in its own section below.

GitHub Actions — dynamic AWS credentials via OIDC #

The old way to call the AWS API from GitHub Actions was to store an IAM user’s access key and secret key in GitHub Secrets. The problems with this approach are clear — keys are static so rotation is difficult, and the blast radius of a leak is large.

The new standard is OIDC trust. GitHub Actions issues a JWT token, AWS IAM verifies it, and temporary credentials are issued — the same pattern as Advanced #2 IRSA.

OIDC provider registration (Terraform) #

terraform/modules/github-oidc/main.tf
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

resource "aws_iam_role" "github_actions_ecr_push" {
  name = "github-actions-myshop-api-ecr-push"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
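        StringLike = {
          # main-branch pushes and v* tag pushes both run the workflow,
          # and each presents a different sub claim.
          "token.actions.githubusercontent.com:sub" = [
            "repo:myshop/myshop-api:ref:refs/heads/main",
            "repo:myshop/myshop-api:ref:refs/tags/v*"
          ]
        }
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecr_push" {
  role       = aws_iam_role.github_actions_ecr_push.name
  policy_arn = aws_iam_policy.ecr_push.arn
}

The sub field in Condition is the key: only workflows triggered from the main branch or a v* tag of the myshop/myshop-api repo can assume this Role. The tag pattern is needed because the workflow below also runs on v* tag pushes, and those runs present a sub of repo:myshop/myshop-api:ref:refs/tags/<tag> rather than the main-branch value. All other repos, branches, and forks are rejected.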
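To see why the sub value matters, the comparison IAM performs can be approximated locally. The sketch below is only an illustration: github_oidc_sub and sub_allowed are hypothetical helper names, and the shell's case pattern matching stands in for IAM's StringLike (both treat * and ? as wildcards, but the shell also honors [...] classes, which IAM does not).

```shell
# Hypothetical helpers for reasoning about the trust policy;
# not part of any AWS or GitHub tooling.

# Build the sub claim GitHub issues for a run triggered by a branch or tag ref.
github_oidc_sub() {
  # $1 = owner/repo, $2 = full ref such as refs/heads/main
  printf 'repo:%s:ref:%s\n' "$1" "$2"
}

# Succeed if the claim matches the pattern (shell globbing as a StringLike stand-in).
sub_allowed() {
  # $1 = sub claim, $2 = pattern from the trust policy
  case $1 in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

pattern='repo:myshop/myshop-api:ref:refs/heads/main'
for ref in refs/heads/main refs/heads/feature-x refs/tags/v1.5.0; do
  claim=$(github_oidc_sub myshop/myshop-api "$ref")
  if sub_allowed "$claim" "$pattern"; then
    echo "allowed: $claim"
  else
    echo "denied:  $claim"
  fi
done
```

With only the main-branch pattern, the feature branch and the tag-triggered run are both denied, which is why every ref that should be able to push images needs its own entry in the condition.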

Workflow — build and push #

.github/workflows/build.yml
name: Build and push

on:
  push:
    branches: [main]
    tags: ['v*']

permissions:
  id-token: write    # needed for OIDC token issuance
  contents: read

env:
  AWS_REGION: ap-northeast-2
  ECR_REPOSITORY: myshop-api

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set image tag
        id: meta
        run: |
          if [[ "$GITHUB_REF" == refs/tags/v* ]]; then
            echo "tag=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT
          else
            echo "tag=main-$(git rev-parse --short HEAD)" >> $GITHUB_OUTPUT
          fi

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-myshop-api-ecr-push
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3   # the gha cache backend needs a Buildx builder

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/${{ env.ECR_REPOSITORY }}:${{ steps.meta.outputs.tag }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

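      - name: Update manifest repo
        env:
          GH_TOKEN: ${{ secrets.MANIFESTS_REPO_TOKEN }}
          IMAGE_TAG: ${{ steps.meta.outputs.tag }}
        run: |
          # Tag pushes target prod values; main-branch pushes target dev values
          if [[ "$GITHUB_REF" == refs/tags/v* ]]; then TARGET_ENV=prod; else TARGET_ENV=dev; fi

          # -f sends raw strings, so a tag such as 1.5 is not converted to a number
          gh api repos/myshop/myshop-manifests/dispatches \
            -f event_type=update-image \
            -f "client_payload[app]=myshop-api" \
            -f "client_payload[tag]=$IMAGE_TAG" \
            -f "client_payload[env]=$TARGET_ENV"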

Three key steps:

  • Configure AWS credentials (OIDC): assumes the IAM Role above via AssumeRoleWithWebIdentity. This single step obtains temporary credentials without any static keys.
  • Build and push: builds the image with Docker Buildx and pushes it to ECR, with layer caching through the GitHub Actions cache. The workflow as written builds for the runner's platform only; a multi-platform build would need a platforms input.
  • Update manifest repo: triggers a workflow in the manifest repo via a repository_dispatch event. That workflow auto-commits the Helm values change.
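The tagging rule in the Set image tag step is easy to check locally before trusting it in CI. The sketch below mirrors that step as a function; image_tag is a hypothetical name, and the ref and short SHA are passed in as arguments instead of being read from GITHUB_REF and git.

```shell
# Hypothetical local mirror of the "Set image tag" step.
image_tag() {
  # $1 = full Git ref, $2 = short commit SHA
  case $1 in
    refs/tags/v*) printf '%s\n' "${1#refs/tags/v}" ;;   # v1.5.0 -> 1.5.0
    *)            printf 'main-%s\n' "$2" ;;            # branch push -> main-<sha>
  esac
}

image_tag refs/tags/v1.5.0 abc1234
image_tag refs/heads/main abc1234
```

A release tag yields a bare version such as 1.5.0, while a branch push yields main-<sha>, so every image tag traces back to exactly one commit.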

Auto-commit in the manifest repo #

In the manifest repo, place a workflow that receives the dispatch event and updates the values files.

myshop-manifests/.github/workflows/update-image.yml
name: Update image tag

on:
  repository_dispatch:
    types: [update-image]

jobs:
  update:
    runs-on: ubuntu-latest
    permissions:
      contents: write    # needed to push the values commit
    env:
      # Pass payload fields through env instead of interpolating them into the script
      APP: ${{ github.event.client_payload.app }}
      TAG: ${{ github.event.client_payload.tag }}
      TARGET_ENV: ${{ github.event.client_payload.env }}
    steps:
      - uses: actions/checkout@v4

      - name: Update values
        run: |
          yq -i '.image.tag = strenv(TAG)' "charts/$APP/values-$TARGET_ENV.yaml"

      - name: Commit and push
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add charts/
          # Skip the commit when the tag is already current
          git diff --cached --quiet && exit 0
          git commit -m "chore: bump $APP to $TAG ($TARGET_ENV)"
          git push

Once this commit lands on the main branch of the manifest repo, ArgoCD detects the change and syncs it to the cluster.
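Because the dispatch payload ends up in a file path and a commit message, it may be worth validating it before use. The following is a minimal sketch, assuming app and environment names are lowercase with hyphens and tags use the characters Docker permits; valid_name, valid_tag, and values_path are hypothetical helpers, not part of the workflow above.

```shell
# Hypothetical guards for the dispatch payload; the allowed character sets are assumptions.

# App and environment names: lowercase letters, digits, and hyphens only.
valid_name() {
  case $1 in
    '' | *[!a-z0-9-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Image tags: characters Docker allows in a tag, not starting with '-' or '.'.
valid_tag() {
  case $1 in
    '' | [-.]* | *[!A-Za-z0-9_.-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the values file path for an app and environment, or fail on unsafe input.
values_path() {
  valid_name "$1" || return 1
  valid_name "$2" || return 1
  printf 'charts/%s/values-%s.yaml\n' "$1" "$2"
}

values_path myshop-api dev
values_path ../../etc dev || echo "rejected unsafe app name"
```

Rejecting anything outside a narrow character set keeps a malformed or malicious payload from steering the update to a file outside charts/.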

ArgoCD — watcher of the manifest repo #

We’ll use the ArgoCD setup covered in Advanced #6 as-is. A single Application CRD manifest handles the deployment of myshop-api for one environment.

ArgoCD installation #

ArgoCD Helm install
helm repo add argo https://argoproj.github.io/argo-helm
helm install argocd argo/argo-cd \
  -n argocd --create-namespace \
  --values argocd-values.yaml

argocd-values.yaml — partial
server:
  ingress:
    enabled: true
    ingressClassName: alb
    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...
    hosts:
      - argocd.myshop.example.com

configs:
  cm:
    timeout.reconciliation: 30s

The ArgoCD UI is exposed at argocd.myshop.example.com. In production environments, integrating SSO (GitHub, Google) is standard practice.

myshop-api Application #

argocd/applications/myshop-api-prod.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myshop-api-prod
  namespace: argocd
spec:
  project: myshop

  source:
    repoURL: https://github.com/myshop/myshop-manifests.git
    targetRevision: main
    path: charts/myshop-api
    helm:
      valueFiles:
        - values.yaml
        - values-prod.yaml

  destination:
    server: https://kubernetes.default.svc
    namespace: myshop

  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - ServerSideApply=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        maxDuration: 3m

  • automated applies git changes to the cluster as soon as ArgoCD detects the new commit, with no manual step. Suitable for dev.
  • selfHeal: true restores a resource to the git manifest automatically, even if someone modifies it directly via kubectl edit.
  • prune: true removes objects from the cluster once they are removed from git.

dev vs prod — auto sync branching #

A common pattern is to disable auto sync for prod and rely on a manual trigger instead.

myshop-api-prod.yaml — manual sync
syncPolicy:
  syncOptions:
    - CreateNamespace=true
    - ServerSideApply=true
  # remove the automated section → manual sync mode

The deploy flow branches as follows.

dev vs prod deploy flow
[dev]
git push → GitHub Actions build → ECR push
        → manifest repo commit (values-dev.yaml)
        → ArgoCD auto sync → dev cluster deploy

[prod]
git tag v1.5.0 → GitHub Actions build → ECR push
              → manifest repo commit (values-prod.yaml)
              → person clicks "Sync" in ArgoCD UI
              → prod cluster deploy

The human gate on prod deploys is the safety net. The manifest itself is reviewed via a git PR, and the actual sync is confirmed once more by an operator.
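For operators who prefer the CLI to the UI button, the manual gate can be wrapped in a small script. This is a sketch under assumptions: promote_prod and the ARGOCD override variable are hypothetical, and the 300-second timeout is an arbitrary choice. It uses the argocd app diff, app sync, and app wait subcommands.

```shell
# Hypothetical wrapper around the manual prod sync. ARGOCD can be overridden
# (for example ARGOCD=echo) to print the commands instead of running them.
promote_prod() {
  app=$1
  cli=${ARGOCD:-argocd}

  # Show what would change. "app diff" exits non-zero when differences exist,
  # so its status is deliberately ignored here.
  "$cli" app diff "$app" || true

  # Apply the change, then block until the Application reports Healthy.
  "$cli" app sync "$app" &&
    "$cli" app wait "$app" --health --timeout 300
}

# Dry run: print the three commands without contacting a cluster.
ARGOCD=echo
promote_prod myshop-api-prod
```

Unsetting ARGOCD (or pointing it at the real binary) runs the same sequence against the cluster, with the diff giving the operator one last look before the sync.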

Standard for Application bundles — App of Apps #

Rather than applying Application manifests to ArgoCD by hand, a single root Application watches all the other Applications.

argocd/root.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default   # spec.project is required; "default" is assumed here
  source:
    repoURL: https://github.com/myshop/myshop-manifests.git
    targetRevision: main
    path: argocd/applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true

When a new Application is added to the argocd/applications/ directory, it is automatically registered with ArgoCD, and that Application syncs its own manifests. This way, even cluster-level operations fall under GitOps.

Image Updater — moving image tag updates to ArgoCD #

In the flow above, GitHub Actions commits to the manifest repo to update image tags. ArgoCD Image Updater is an option that shifts this responsibility to ArgoCD itself.

myshop-api-prod.yaml — Image Updater annotation
metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: api=123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myshop-api
    argocd-image-updater.argoproj.io/api.update-strategy: semver
    argocd-image-updater.argoproj.io/write-back-method: git
    # A leading / resolves from the repo root; a relative path resolves from spec.source.path
    argocd-image-updater.argoproj.io/write-back-target: helmvalues:/charts/myshop-api/values-prod.yaml

ArgoCD Image Updater periodically polls ECR and, when it discovers a new tag, auto-commits to the manifest repo. The explicit commit step in GitHub Actions becomes unnecessary, but immediacy drops because each update waits for the next polling cycle (2 minutes by default, configurable). If you want the code push and manifest commit to appear in git history in a clear order, the GitHub Actions commit model is more intuitive.
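What the semver strategy does can be pictured as "pick the highest version among the tags in the registry". The sketch below is a deliberately simplified stand-in, not Image Updater's actual logic: latest_semver is a hypothetical helper that considers only plain X.Y.Z tags and ignores pre-releases and version constraints.

```shell
# Hypothetical helper: print the highest plain X.Y.Z tag read from stdin.
latest_semver() {
  grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' |
    sort -t . -k 1,1n -k 2,2n -k 3,3n |
    tail -n 1
}

printf '%s\n' 1.5.0 1.10.0 main-abc1234 1.9.3 | latest_semver
```

The numeric sort per field is the important part: a plain string sort would rank 1.9.3 above 1.10.0. Tags such as main-abc1234 are filtered out, which is also why dev-style tags do not interfere with a semver-driven prod update.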

Canary / Blue-Green — Argo Rollouts #

A standard Deployment’s RollingUpdate is the simplest zero-downtime deployment model. More sophisticated patterns — canary, blue-green, automated analysis followed by promotion — are handled by Argo Rollouts.

rollout.yaml — 5% canary → analysis → 100%
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myshop-api
  namespace: myshop
spec:
  replicas: 10
  strategy:
    canary:
      canaryService: myshop-api-canary
      stableService: myshop-api-stable
      trafficRouting:
        alb:
          ingress: myshop-api
          servicePort: 80
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - analysis:
            templates:
              - templateName: success-rate
        - setWeight: 25
        - pause: { duration: 10m }
        - setWeight: 50
        - pause: { duration: 10m }
        - setWeight: 100
  selector:
    matchLabels:
      app.kubernetes.io/name: myshop-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: myshop-api   # must match spec.selector
    spec:
      containers:
        - name: api
          image: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myshop-api:1.5.0
          # ... (same spec as Deployment)

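The new version receives 5% of traffic for 5 minutes, then the automated analysis (a Prometheus metric query) runs. If it passes, traffic shifts in stages: 25% → 50% → 100%. If the analysis fails, the rollout is aborted and traffic returns to the stable version. Because the analysis here is an inline step, it runs once at the 5% stage; checking throughout the rollout would require a background analysis instead.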

analysistemplate.yaml — success rate analysis
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          query: |
            sum(rate(http_requests_total{app="myshop-api",status=~"2.."}[5m]))
              / sum(rate(http_requests_total{app="myshop-api"}[5m]))
      successCondition: result[0] >= 0.99
      failureLimit: 1

The Prometheus metrics that #5 covers feed directly into the canary’s auto-promote/rollback decision here, which is why Rollouts shows its real value only when paired with that observability stack.
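The decision the AnalysisTemplate encodes is a single ratio compared against a threshold. As a rough sketch of that arithmetic, the hypothetical success_rate_ok helper below takes request counts instead of querying Prometheus; treating zero traffic as a failure is an assumption of this sketch, not documented Rollouts behavior.

```shell
# Hypothetical check mirroring successCondition: result[0] >= 0.99.
# $1 = count of 2xx responses, $2 = count of all responses, $3 = optional threshold.
success_rate_ok() {
  awk -v ok="$1" -v total="$2" -v min="${3:-0.99}" \
    'BEGIN { exit !(total > 0 && ok / total >= min) }'
}

if success_rate_ok 9950 10000; then echo "promote"; else echo "abort"; fi
if success_rate_ok 9800 10000; then echo "promote"; else echo "abort"; fi
```

The first call (99.5% success) would promote and the second (98%) would abort. At 5% canary weight the request count behind the ratio is small, so the 5-minute window in the query matters as much as the threshold.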

PR flow standard — environments + required reviewers #

The same human gate can be enforced in GitHub Actions as well.

.github/workflows/build.yml — using environment
jobs:
  build-prod:
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://api.myshop.example.com
    steps:
      - ...

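Creating a production environment in GitHub Settings and specifying Required reviewers means jobs targeting that environment will not start without human approval. This is the standard pattern that prevents prod deploys from triggering automatically on a single tag push. One caveat: a job that references an environment presents an OIDC sub claim of repo:myshop/myshop-api:environment:production instead of a ref-based value, so the IAM role's trust policy must allow that value if the job assumes the role.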

First cycle’s checks #

Items to check once the full GitHub Actions push → ECR → manifest commit → ArgoCD sync cycle has run end-to-end.

ECR image check
aws ecr describe-images \
  --repository-name myshop-api \
  --region ap-northeast-2 \
  --query 'imageDetails[*].[imageTags,imagePushedAt]' \
  --output table

ArgoCD Application status
argocd app get myshop-api-prod
argocd app sync myshop-api-prod   # manual sync (prod only)
argocd app history myshop-api-prod

Deployed image tag check
kubectl get deployment myshop-api -n myshop \
  -o jsonpath='{.spec.template.spec.containers[0].image}'

When all three commands consistently point to the new tag, the cycle is working correctly. The same information is displayed visually in the ArgoCD UI, and any drift between the manifest and the cluster is also visible at a glance.
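The last check can be turned into a pass/fail comparison instead of an eyeball test. A minimal sketch, assuming the image reference ends in :tag with no digest; image_tag_of and tag_matches are hypothetical helper names.

```shell
# Hypothetical helpers for comparing the deployed image with the expected tag.

# Print the tag portion of an image reference (text after the last colon).
# Assumes the reference ends in :tag and carries no digest.
image_tag_of() {
  printf '%s\n' "${1##*:}"
}

# Succeed if the image reference carries the expected tag.
tag_matches() {
  [ "$(image_tag_of "$1")" = "$2" ]
}

image='123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myshop-api:1.5.0'
if tag_matches "$image" 1.5.0; then echo "in sync"; else echo "drift"; fi
```

In a real check, image would come from the kubectl jsonpath query above rather than a hard-coded string.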

One trap — container image tag mutability #

The operational standard is to treat image tags as immutable. Allowing the same tag to point to different images renders ArgoCD’s drift detection meaningless. The following setup is essential:

  • Enable immutable tags on the ECR repository: set image_tag_mutability = "IMMUTABLE" via Terraform
  • Never use the latest tag in prod: always use a git SHA or semver tag
  • Image tag = git commit hash or git tag, so which commit is running in which environment is visible at a glance

Without this setup, you get incidents like “the tag that worked yesterday now points to a different image.” That is exactly the point at which GitOps’s source of truth breaks down.
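ECR immutability stops a tag from being overwritten, but it does not stop someone from pushing a floating tag in the first place. A small guard in CI can cover that gap. The sketch below is an assumption-laden example: allowed_tag is a hypothetical helper, and the accepted formats (X.Y.Z or main-<short sha>) are inferred from the tagging scheme used in this post.

```shell
# Hypothetical tag policy: accept X.Y.Z or main-<hex sha>, reject everything else
# (including latest). The exact formats are assumptions based on this post's workflow.
allowed_tag() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]+\.[0-9]+\.[0-9]+|main-[0-9a-f]{7,40})$'
}

for tag in 1.5.0 main-abc1234 latest; do
  if allowed_tag "$tag"; then
    echo "ok:       $tag"
  else
    echo "rejected: $tag"
  fi
done
```

Running such a check before the push step fails the build early instead of leaving an untraceable image in the registry.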

Closing #

We turned the full flow of getting new myshop-api versions into the cluster into code. GitHub Actions pushes to ECR via OIDC trust without static keys and auto-commits Helm values to the manifest repo, and ArgoCD detects that change and syncs it to the cluster. We also established the deployment pattern: dev is auto-synced, prod passes the triple gate of PR review, GitHub environments, and ArgoCD manual sync, and canary auto-promote/rollback is handled through Argo Rollouts. At this point, a single code push auto-deploys to dev and a single git tag queues a prod deploy. The next post covers the observability stack that watches all of this (Prometheus, Grafana, Alertmanager, and CloudWatch) along with the core alert rule set.
