Chapter 20

GitOps

This chapter covers the operational model in which git holds the source of truth for manifests and a controller inside the cluster watches git and syncs automatically. To wrap up Part 3, we go through the difference between the push model and the pull model, the four principles of GitOps, ArgoCD’s Application CRD · App of Apps · Sync Wave, Flux’s GitRepository · Kustomization · HelmRelease, directory-structure patterns, and three standard tools for keeping secrets in git.

This is the last chapter of Part 3. From Chapter 15 CNI in Depth through Chapter 19 Observability, we stacked the cluster’s data plane · permissions · policy · extension · observation layer by layer. This chapter covers how all those manifests enter the cluster in the first place: GitOps. Instead of a person running kubectl apply by hand, the source of truth for manifests lives in git and a controller inside the cluster watches git and syncs automatically. ArgoCD and Flux are the two standard implementations of this model. As Part 3’s final chapter, we cover both tools’ models and operational patterns, then close with a Part 3 retrospective and a Part 4 preview.

By the end of this chapter you’ll have a setup in which changes to a production cluster flow through a git PR rather than through a person’s hands. The idea that “a declarative manifest is always the source of truth,” which we touched on in §“Adjusting replicas” of Chapter 4 Deployment and ReplicaSet, becomes a full-fledged operational model here.

The push model and the pull model

First, let’s compare GitOps with the model that was standard before it appeared. The ways a CI / CD pipeline applies a manifest to the cluster broadly split into two models.

| Model | Flow |
| --- | --- |
| Push | The CI pipeline runs kubectl apply directly against the cluster’s API server. The CI system holds the cluster credentials |
| Pull (GitOps) | A controller inside the cluster watches git. When a manifest changes, the controller syncs automatically |

The traditional CD pipeline was the push model: GitHub Actions or Jenkins ran kubectl apply -f manifests/ after a build, and that was it. This model has three problems.

  • The CI system holds strong credentials to the cluster — if the CI system is compromised, so is the cluster.
  • Drift is invisible — if someone runs kubectl edit directly on the cluster, git’s manifest and the actual cluster diverge, but there’s no standard mechanism to detect that divergence.
  • It’s hard to scale to several clusters — to apply the same manifest to N clusters you have to run kubectl apply N times.

The GitOps model solves these three at once. Because a controller inside the cluster watches git, there’s no need to hold cluster credentials externally; because that controller keeps looking at the sync state, it automatically detects drift; and because it’s a model where each cluster watches its own git, scaling to N clusters is natural.

The four principles of GitOps

The OpenGitOps project defines four principles of GitOps.

| Principle | Meaning |
| --- | --- |
| Declarative | the system’s desired state is expressed declaratively |
| Versioned and Immutable | the desired state is stored in a way that enforces immutability and versioning and retains a complete version history (git, for example) |
| Pulled Automatically | software agents automatically pull the desired state declarations from the source |
| Continuously Reconciled | software agents continuously observe the actual state and attempt to apply the desired state |

K8s manifests are declarative, and git is versioned + immutable. ArgoCD and Flux add pull-based reconciliation on top to complete GitOps. It’s the model where the reconcile loop we saw in Chapter 1 What Kubernetes Is extends beyond the controller layer inside the cluster to the sync between cluster and git.

ArgoCD: the Application CRD-centered model

ArgoCD is a GitOps tool that Intuit built and donated to the CNCF. Its biggest characteristic is a rich web UI. You can see the sync state, drift, and manifest change history of every Application in the cluster on one screen, so the barrier to entry for an ops team is low.

Application CRD: ArgoCD’s unit

The unit by which ArgoCD fetches a manifest from git and syncs it to the cluster is the Application CRD. The CRD and Operator pattern of Chapter 18 applies directly to a GitOps tool here too.

application-my-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/manifests.git
    targetRevision: main
    path: apps/my-app/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

When this manifest is applied to the cluster where ArgoCD runs, the following happens automatically.

  1. The ArgoCD controller fetches the path directory of repoURL from git
  2. It auto-recognizes Kustomize / Helm / plain YAML and renders the manifest
  3. It syncs to the cluster’s destination (because automated.selfHeal: true, it auto-repairs drift)
  4. When git’s manifest changes, it auto-re-syncs (automated)
  5. Objects gone from git are deleted from the cluster too (prune: true)

App of Apps: managing a bundle of Applications

The pattern of managing several Applications in one place is App of Apps. It’s a structure where one Application’s source points to a directory containing other Application manifests.

App of Apps directory structure
manifests/
  apps/
    root.yaml              ← root Application (applied to ArgoCD manually)
    children/
      app-a.yaml           ← Application: app-a
      app-b.yaml           ← Application: app-b
      app-c.yaml           ← Application: app-c
  ...

Register just one root Application with ArgoCD first, and inside it the child Applications are created in turn, and each child Application in turn syncs its own manifest. To add a new app to the cluster, you just add one new child Application to git and you’re done.
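
A root Application for the layout above might look like the following. This is a minimal sketch that reuses the repository URL from the earlier example; the name root and the choice to create the child Application objects in the argocd namespace are assumptions for illustration.

```yaml
# apps/root.yaml: the one Application applied by hand (sketch; names are illustrative)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root                 # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/manifests.git
    targetRevision: main
    path: apps/children      # directory holding app-a.yaml, app-b.yaml, app-c.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd        # child Application objects are created here
  syncPolicy:
    automated:
      prune: true            # deleting a child file from git removes that Application
      selfHeal: true
```

With prune enabled on the root, removing a child file from git deletes that Application object. Whether the child’s own resources are deleted along with it depends on whether the child carries ArgoCD’s resources finalizer, so that is worth deciding per app.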

Sync Wave: ordered application

There are cases where manifest application needs an order — the order of creating a Namespace first, then a ConfigMap inside it, then bringing up a Deployment. ArgoCD expresses this order with an annotation.

Expressing order — argocd.argoproj.io/sync-wave
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # Namespace
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"   # ConfigMap, Secret
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "2"   # Deployment, StatefulSet

A lower wave is applied first, and the next wave proceeds only after all of that wave’s objects reach a healthy state. The most often-met use is the pattern of creating a CRD first and then applying instances of that CRD.
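
Written out as complete objects, the first two waves could look like this. It is only a sketch: the my-app namespace matches the earlier Application example, while the ConfigMap name and its key are hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # applied first
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config                   # hypothetical name
  namespace: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "1"   # applied after wave 0 is healthy
data:
  LOG_LEVEL: info                       # hypothetical setting
```

The wave number is quoted because annotation values must be strings. Objects without the annotation fall into wave 0.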

Flux: a bundle of small components

Flux is a GitOps tool built by Weaveworks, in the same category as ArgoCD but with a different approach. Flux v2 is designed not as one big component but as a bundle of several small controllers.

| Flux controller | Role |
| --- | --- |
| source-controller | fetches manifests from git / Helm repos / OCI images |
| kustomize-controller | applies Kustomize manifests |
| helm-controller | applies Helm charts via the HelmRelease object |
| notification-controller | notifies events to Slack / Teams / GitHub, etc. |
| image-automation-controller | auto-commits new container-image versions to git |

Each controller has its own CRD, and all behavior is expressed with that CRD’s manifest.

GitRepository + Kustomization: Flux’s basic bundle

Registering a git repository
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: manifests
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/myorg/manifests.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 5m
  path: ./apps/my-app/overlays/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: manifests
  targetNamespace: my-app

GitRepository watches git, and Kustomization applies one directory of that git to the cluster. Thanks to the separation of the two objects, several Kustomizations can point at the same git with different paths.
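
As a sketch of that separation, a second Kustomization can reuse the same GitRepository with a different path. The other-app name, path, and namespace are hypothetical; everything else mirrors the manifest above.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: other-app                        # hypothetical second app
  namespace: flux-system
spec:
  interval: 5m
  path: ./apps/other-app/overlays/prod   # assumed path in the same repository
  prune: true
  sourceRef:
    kind: GitRepository
    name: manifests                      # same source object as my-app
  targetNamespace: other-app             # hypothetical namespace
```

targetNamespace only sets the namespace on the applied objects. The namespace itself is expected to exist already or to be defined in the manifests being applied.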

HelmRelease: GitOps-ifying a Helm chart

HelmRelease — express a Helm chart as a manifest too
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: prometheus
  namespace: monitoring
spec:
  interval: 10m
  chart:
    spec:
      chart: kube-prometheus-stack
      version: "55.x"
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
  values:
    prometheus:
      prometheusSpec:
        retention: 30d

Instead of running helm install by hand, express it with a HelmRelease manifest and the install · upgrade of a Helm chart comes into the GitOps flow too. The kube-prometheus-stack of Chapter 19 can be tied into GitOps in the same shape.
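
The HelmRelease above references a HelmRepository named prometheus-community, which has to exist as its own object. A minimal sketch of that source follows, assuming it lives in the same monitoring namespace and points at the upstream prometheus-community chart repository; the interval is an arbitrary choice.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1   # older Flux releases serve this kind as v1beta2
kind: HelmRepository
metadata:
  name: prometheus-community
  namespace: monitoring      # same namespace as the HelmRelease, so sourceRef needs no namespace
spec:
  interval: 1h               # how often to refresh the repository index (assumed value)
  url: https://prometheus-community.github.io/helm-charts
```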

ArgoCD vs Flux: the shape of the choice

| Dimension | ArgoCD | Flux |
| --- | --- | --- |
| Model | one big component + rich UI | a bundle of small controllers + CLI-centric |
| Barrier to entry | low (start with the UI) | medium (start with CRD manifests) |
| Multi-tenancy | expressed with AppProject | separation per namespace |
| Multi-cluster | one ArgoCD can manage several clusters | one Flux per cluster (hub-spoke possible) |
| Helm support | first-class | first-class via the HelmRelease CRD |
| Image auto-update | argocd-image-updater (separate project) | image-automation-controller (part of Flux, installed as an optional component) |

The choice usually follows this.

  • If the ops team prefers a GUI and wants to see all clusters on one screen, ArgoCD is natural.
  • If you want to express all operations as manifests and prefer a bundle of small components, Flux fits well.

Both tools are CNCF graduated projects and are proven at production scale. This isn’t the kind of decision where picking the wrong one causes a big incident. Chapter 24 CI / CD Pipeline covers the combination of GitHub Actions and ArgoCD: the flow where the built image tag comes in via a git PR and ArgoCD auto-syncs.

Directory-structure patterns

A GitOps repo’s directory structure varies with the ops team’s style, but two patterns come up often.

1. env-per-folder: branching by environment

Environment at the top level
manifests/
  base/
    my-app/
      deployment.yaml
      service.yaml
  envs/
    dev/
      kustomization.yaml      ← base + dev patch
    staging/
      kustomization.yaml
    prod/
      kustomization.yaml

A structure that uses Kustomize’s base + overlay pattern unchanged. Per-environment differences (replicas, image tag, resources) go into the overlay as a patch. The one-set-of-manifests-per-environment pattern we touched on in Chapter 7 Namespace and Labels runs on top of this directory structure.
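
A prod overlay for this layout could be sketched as follows. The field names are standard Kustomize; the namespace, the Deployment name my-app, the image name, and the tag are all assumptions about what the base contains.

```yaml
# envs/prod/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: my-app-prod              # hypothetical namespace for this environment
resources:
  - ../../base/my-app               # the shared base
replicas:
  - name: my-app                    # assumes the base Deployment is named my-app
    count: 3
images:
  - name: ghcr.io/myorg/my-app      # hypothetical image name used in the base
    newTag: "1.4.2"                 # hypothetical tag
```

The dev and staging overlays would have the same shape with a different namespace, replica count, and tag, so a PR that changes one environment touches only that environment’s file.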

2. app-per-folder + branch-per-env

App at the top level, environment in a branch
manifests/        (main branch = prod)
  apps/
    my-app/
      deployment.yaml
      service.yaml
    other-app/
      ...

manifests-dev/    (dev branch)
manifests-staging/ (staging branch)

A model that uses branches as the environment split. The environment differences are expressed as commits so auditing is natural, but the sync burden between branches is large.

In practice, env-per-folder is used more often. Because the flow of changes happens inside one branch (main), PR review is simple.

How to put a Secret in git

One of GitOps’s big open problems is how to put secrets in git. A K8s Secret is only base64-encoded, not encrypted, so you can’t commit it to git as-is. There are three standard tools.

| Tool | Model |
| --- | --- |
| Sealed Secrets | A tool built by Bitnami. Encrypts a secret as a SealedSecret and puts it in git. Only the controller inside the cluster can decrypt with its own key |
| External Secrets Operator | Puts only a reference to an external secret store (AWS Secrets Manager, Vault, etc.) in git, and the controller syncs it into a K8s Secret |
| SOPS + age / PGP | Puts the encrypted YAML directly in git. Flux decrypts SOPS natively in kustomize-controller; ArgoCD needs a plugin-based setup (for example KSOPS or helm-secrets) |

External Secrets Operator is one of the most commonly used paths. The secret’s source of truth is in the external store, and only the reference comes into K8s, so secret rotation finishes in one stroke at the external store. Combined with the IRSA of Chapter 16 RBAC / ServiceAccount in Depth, you can keep even the secret-store access credentials from sitting statically inside the cluster. The full operational comparison of the three tools is covered in Chapter 29 Secret Operations.
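
To make “only the reference goes in git” concrete, here is a minimal ExternalSecret sketch. The store name, the remote key, and the property are hypothetical, the ClusterSecretStore it points at is assumed to be defined separately, and the apiVersion depends on the installed External Secrets Operator release.

```yaml
apiVersion: external-secrets.io/v1      # older ESO releases serve v1beta1 instead
kind: ExternalSecret
metadata:
  name: my-app-db                       # hypothetical name
  namespace: my-app
spec:
  refreshInterval: 1h                   # how often to re-read the external store
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager           # hypothetical store, defined separately
  target:
    name: my-app-db                     # name of the K8s Secret that gets created
  data:
    - secretKey: password               # key inside the K8s Secret
      remoteRef:
        key: prod/my-app/db             # hypothetical secret name in the external store
        property: password              # JSON field inside that secret
```

Nothing in this file is sensitive, so it can sit in git next to the Deployment that consumes the resulting Secret.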

Principles to pin down in operations

1. auto-sync vs manual sync: branching by environment

With syncPolicy.automated on, a change in git is applied to the cluster as soon as the controller notices it, with no human step in between. It’s common to set dev / staging to automatic and prod to manual (or automatic after a PR merge). When you do put auto-sync on prod, set safeguards alongside it, such as syncOptions: PruneLast=true so that deletions are handled last.
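
As a sketch, the split can be expressed in the syncPolicy of each environment’s Application. These are fragments of spec rather than complete manifests, and which environment gets which policy is an assumption.

```yaml
# dev / staging Application: auto-sync on
spec:
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
---
# prod Application: no automated block, so a sync is triggered by a person or a pipeline
spec:
  syncPolicy:
    syncOptions:
      - PruneLast=true   # when a sync runs, prune only after the other resources are applied and healthy
```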

2. The meaning of drift detection

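The GitOps controller continuously compares git with the actual cluster. If someone modifies an object directly with kubectl edit, that change is detected as drift on the next comparison, and with selfHeal: true ArgoCD overwrites it again with git’s manifest (Flux reverts the fields it manages on its next reconciliation). This is GitOps’s strength, but it is also a pitfall: fields that other controllers write at runtime (status, auto-generated labels, a replica count set by an autoscaler) must not be treated as drift. ArgoCD lets you list the fields to ignore with the ignoreDifferences setting. Flux has no field of that name; it relies on server-side apply field ownership, with per-object opt-outs such as the kustomize.toolkit.fluxcd.io/ssa annotation and, for Helm releases, the driftDetection.ignore rules of HelmRelease. §“status subresource” of Chapter 18 is the pattern that separates this drift problem in advance at the CRD level.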
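
For ArgoCD, such an ignore rule could be sketched as below. The scenario is an assumption (a Deployment whose replica count is managed by an autoscaler rather than by git), and the fragment belongs in an Application’s spec.

```yaml
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas                # owned by the autoscaler, not by git
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true   # also leave the ignored fields alone during sync
```

Without the sync option, the field is hidden from the diff but can still be overwritten whenever a sync runs for another reason.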

3. The impact of changing a Helm value

Changing a HelmRelease’s values triggers a Helm upgrade of the whole release: every template is re-rendered, and each object whose rendered manifest changes is updated, which can roll many workloads at once. To keep an unintended redeploy from happening, check the blast radius of a values change in advance with a dry-run at the PR stage. The gradual dry-run adoption pattern of Chapter 17 Admission Controller ties directly into GitOps syncing: both the policy dry-run and the GitOps dry-run are safeguards in the same direction.

4. The hub-spoke model for multi-cluster

There are two standard models when managing N clusters with GitOps.

  • Each cluster has its own GitOps controller — cluster-self-contained, few external dependencies
  • The GitOps controller of one hub cluster manages the spoke clusters — operationally simpler, the hub’s availability is critical

ArgoCD fits both models well, and for Flux the first model is natural.
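
In the hub model with ArgoCD, the Application lives in the hub and names a spoke as its destination. A sketch follows, assuming a spoke cluster has already been registered with ArgoCD under the hypothetical name prod-eu.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-prod-eu              # hypothetical name
  namespace: argocd                 # lives in the hub cluster
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/manifests.git
    targetRevision: main
    path: apps/my-app/overlays/prod
  destination:
    name: prod-eu                   # hypothetical spoke cluster, registered with ArgoCD beforehand
    namespace: my-app
```

A destination is identified by either name or server, not both. In the first model this field simply stays at https://kubernetes.default.svc, as in the earlier example.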

Part 3 retrospective: what we gained over six chapters of depth

Since this is Part 3’s last chapter, let’s take stock. Part 1 built the model of a single manifest, Part 2 added the depth of running that manifest in a production cluster, and Part 3 added, layer by layer, the depth of the policy engine · extension · observation · syncing on top.

  • Chapter 15, CNI in Depth. The four conditions of the K8s network model, the CNI interface, the iptables / IPVS / eBPF data plane, a comparison of Calico and Cilium.
  • Chapter 16, RBAC / ServiceAccount in Depth. Aggregated ClusterRole, Impersonation, the projected token, connecting a K8s ServiceAccount to cloud IAM with IRSA / Workload Identity.
  • Chapter 17, Admission Controller. The API server’s five-stage flow, the mutating / validating webhook, a comparison of the policy engines OPA Gatekeeper and Kyverno.
  • Chapter 18, The CRD and Operator pattern. The path to extending the K8s API itself, the skeleton of a controller-runtime-based Operator, ownerReference / finalizer / status subresource.
  • Chapter 19, Observability. The three axes of metrics / logs / traces, Prometheus + kube-state-metrics, Loki, OpenTelemetry, Grafana, Alertmanager.
  • Chapter 20 (this chapter), GitOps. The operational model of putting the source of truth for manifests in git, ArgoCD and Flux, directory structure and secret management.

If you’ve followed all six of these chapters, you have the field of vision needed to adopt and operate K8s — the stage of deciding which CNI to adopt, which policy engine to adopt, which observability stack to choose, and how to build the GitOps pipeline.

Exercises

  1. Install ArgoCD on a local cluster and create one Application pointing at your manifest repository. Leave automated.selfHeal: true, then modify a cluster object directly with kubectl edit. Record in chronological order the flow where ArgoCD detects drift and auto-repairs with git’s manifest, and organize it into one paragraph with the model of §“The meaning of drift detection.”
  2. Reorganize your manifest repository into an env-per-folder structure. Split it into base/, envs/dev/, envs/staging/, envs/prod/, and organize how dev’s and prod’s replicas · image tag are expressed as overlays. Note in one paragraph how the namespace separation of Chapter 7 pairs with this directory structure.
  3. Compare the models of the three tools Sealed Secrets, External Secrets Operator, and SOPS with the table of §“How to put a Secret in git,” then decide in one paragraph which tool to pick against your cluster environment (EKS / GKE / on-prem, whether you use an external secret store). Also reason which tool fits most smoothly into “zero passwords” operations when combined with the IRSA / Workload Identity of Chapter 16.

In short: GitOps is an operational model that puts the source of truth for manifests in git and has a controller inside the cluster watch git to pull and reconcile. ArgoCD’s strengths are a rich UI and the Application CRD · App of Apps · Sync Wave; Flux’s strengths are a bundle of small controllers (source · kustomize · helm · notification · image-automation) and a CLI-centric workflow. For secrets, pick among the three tools Sealed Secrets · External Secrets · SOPS to fit the environment; combined with IRSA, you can operate secret-store access without static credentials. The four operational guardrails are branching auto-sync by environment · ignoring expected differences in drift detection · checking the impact of a Helm value change · deciding on a multi-cluster hub-spoke model.

Next chapter

Part 3 is over. If Parts 1 to 3 were about understanding K8s and its operational model at the manifest level, Part 4 is about putting a real service on top and operating it. We follow, from start to finish, the flow of setting up a cluster on EKS from scratch, building the app deployment skeleton, connecting a DB, building the CI / CD pipeline, and adding monitoring · alerts.

The storyline of all of Part 4 is as follows.

| Chapter | Subject |
| --- | --- |
| Chapter 21 | EKS Cluster Setup: Terraform · eksctl · IRSA · add-ons |
| Chapter 22 | App Deployment Skeleton: Deployment · Service · Ingress · Helm |
| Chapter 23 | DB Integration: RDS · Secrets Manager · External Secrets · connection pool |
| Chapter 24 | CI / CD Pipeline: GitHub Actions · ECR · ArgoCD |
| Chapter 25 | Monitoring · Alerts: Prometheus · CloudWatch · Alertmanager |
| Chapter 26 | Operations Checklist: upgrades · backup · recovery · cost · security |

From Chapter 21 EKS Cluster Setup on, it’s not abstraction but a real adoption case. We follow the flow of building the VPC · IAM · EKS cluster from scratch with Terraform, framing the node groups and add-ons, and installing IRSA and the ALB Controller together.
