Chapter 29

Secret Operations

The third chapter of Part 5. Starting from the base64 limitation of a K8s Secret and the meaning of etcd encryption-at-rest, it covers the secret lifecycle along four axes: storage · rotation · injection · audit. It compares sealed-secrets · external-secrets · SOPS and covers zero-password operation (IRSA for the AWS API, RDS IAM auth for the DB), the rotation difference between envFrom and volume mounts, per-namespace separation with RBAC, and auditing with the audit log and GuardDuty, all as a practical operations manual.

This is the third chapter of Part 5 (Operations · Debugging · Cost). Secrets have appeared piecemeal across several chapters of this book (Chapter 6, ConfigMap · Secret, Chapter 14, RBAC / NetworkPolicy / ResourceQuota, Chapter 16, RBAC / ServiceAccount in Depth, Chapter 18, CRD and Operator, Chapter 20, GitOps, Chapter 23, DB Integration). This chapter gathers those fragments into one operations manual. It starts from the point after “don’t commit secret YAML to git unchanged” and covers the full production secret lifecycle.

The goal of this chapter is a state where the four axes of storage · rotation · injection · audit are organized into one operational model. We compare the differences among the three tools sealed-secrets / external-secrets / SOPS, cover the “zero passwords” operation combined with IRSA, and establish a baseline for secret governance.

The limit of a K8s Secret — base64 is not encryption #

The point made in Chapter 6, ConfigMap · Secret §“The essential limit of a Secret” is this chapter’s starting point.

an ordinary Secret — the meaning of base64
apiVersion: v1
kind: Secret
metadata:
  name: myshop-api
type: Opaque
data:
  DATABASE_PASSWORD: cG9zdGdyZXNAcHJvZA==   # postgres@prod

The value of data is merely base64-encoded, not encrypted. Anyone can see the original with one base64 -d. If this manifest is committed to git, the secret is exposed externally unchanged.
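To make the point concrete, here is a minimal Python sketch that recovers the plaintext from the manifest value above. The helper name decode_secret_value is illustrative only; the important part is that no key or credential is involved.

```python
import base64

# Value copied from the manifest above; base64 is an encoding, not a cipher.
encoded = "cG9zdGdyZXNAcHJvZA=="


def decode_secret_value(value: str) -> str:
    """Return the plaintext behind a Secret's data field; no key is involved."""
    return base64.b64decode(value).decode("utf-8")


print(decode_secret_value(encoded))  # prints: postgres@prod
```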

The meaning and limit of etcd encryption-at-rest #

The K8s API Server stores objects in etcd. EKS encrypts etcd’s disk with KMS by default, but the objects stored inside etcd are still plaintext. That means anyone who obtains the etcd data itself (a snapshot or a backup, for example) can read the Secret’s value.

What prevents this is encryption-at-rest.

terraform — enabling EKS Secret encryption
module "eks" {
  # ...
  encryption_config = [{
    provider_key_arn = aws_kms_key.eks.arn
    resources        = ["secrets"]
  }]
}

When this setting is on, the Secret object itself is encrypted with a KMS key before being stored in etcd. It’s the default setup of a production cluster. However, even with this on, the git-commit problem of the manifest isn’t solved — it’s protection inside etcd, not protection at the manifest stage.

Keeping these two separate is the first mental model of secret operations.

| Location | Protection tool |
| --- | --- |
| the manifest inside the git repo | sealed-secrets / external-secrets / SOPS |
| etcd inside the cluster | encryption-at-rest (KMS) |
| the environment variables / files inside the Pod | Pod isolation, RBAC, audit |
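Whether the etcd layer is actually protected can be checked from the cluster description. The sketch below assumes you already hold the dictionary returned by the EKS DescribeCluster API (for example from boto3's describe_cluster); the function name is hypothetical, and the response shape it reads (cluster.encryptionConfig with resources and provider.keyArn) should be confirmed against your SDK version.

```python
def secrets_encrypted_with_kms(describe_cluster_response: dict) -> bool:
    """Return True if the cluster lists 'secrets' under a KMS encryption config."""
    cluster = describe_cluster_response.get("cluster", {})
    for entry in cluster.get("encryptionConfig") or []:
        has_key = bool(entry.get("provider", {}).get("keyArn"))
        if has_key and "secrets" in entry.get("resources", []):
            return True
    return False


# Hand-written example response; the account ID and key ID are placeholders.
example = {
    "cluster": {
        "encryptionConfig": [
            {
                "resources": ["secrets"],
                "provider": {"keyArn": "arn:aws:kms:ap-northeast-2:111122223333:key/example"},
            }
        ]
    }
}
print(secrets_encrypted_with_kms(example))  # prints: True
```

This only answers the etcd row of the table. It says nothing about the manifest in git or the values inside the Pod.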

The four axes of secret operations #

The production secret lifecycle breaks down into four axes.

| Axis | Question |
| --- | --- |
| Storage | where is the real secret value? |
| Rotation | how is the password renewed? |
| Injection | how does the Pod receive that value? |
| Audit | who accessed that value, and when? |

Most secret incidents are not a defect of one axis but the combined result of leaks across several axes. Introducing only one tool resolves one or two axes, but if another axis is weak, security breaks in the end. Treating the four axes as one system is the starting point of operations.

The injection pattern — envFrom vs mount #

There are two standard patterns for injecting a secret into a Pod.

envFrom — inject as environment variables
spec:
  containers:
    - name: api
      envFrom:
        - secretRef:
            name: myshop-api-db
volumeMount — inject as a file
spec:
  containers:
    - name: api
      volumeMounts:
        - name: db-secret
          mountPath: /var/secrets/db
          readOnly: true
  volumes:
    - name: db-secret
      secret:
        secretName: myshop-api-db

The decisive difference between the two patterns is behavior on rotation.

| Aspect | envFrom | volumeMount |
| --- | --- | --- |
| on Secret renewal | the environment variables inside the Pod stay unchanged (fixed at Pod start) | the file is auto-renewed within about 1 minute |
| rotation handling | needs a Pod restart | the application just needs to re-read the file |
| debugging | check instantly with the env command | check the file path + cat |

The trap of “needing kubectl rollout restart on Secret renewal” noted in Chapter 23, DB Integration is specific to envFrom. For secrets that rotate frequently, volumeMount is operationally more natural: if the application code re-reads the file periodically, the new secret applies without a Pod restart.

That said, the environment-variable model is simpler, and most 12-factor apps expect environment variables. The standard is to choose between the two based on rotation frequency and the application’s needs.
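The re-read approach for a mounted secret can be sketched as follows. This is a minimal illustration, not the book's implementation: the class name ReloadingSecret is hypothetical, and it assumes the Secret has a key named password mounted under /var/secrets/db as in the volumeMount example.

```python
import os
import threading
from pathlib import Path


class ReloadingSecret:
    """Re-reads a mounted secret file whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        self._lock = threading.Lock()
        self._mtime_ns = -1
        self._value = ""

    def get(self) -> str:
        # os.stat follows symlinks, so the symlink swap the kubelet performs
        # when it updates a Secret volume shows up as a new modification time.
        mtime_ns = os.stat(self._path).st_mtime_ns
        with self._lock:
            if mtime_ns != self._mtime_ns:
                self._value = self._path.read_text(encoding="utf-8").strip()
                self._mtime_ns = mtime_ns
            return self._value
```

A caller would create ReloadingSecret("/var/secrets/db/password") once and call get() each time it opens a new database connection, so a rotated value is picked up on the next connection rather than at the next Pod start.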

sealed-secrets — commit safely to git #

Bitnami’s sealed-secrets is a tool that seals a secret into a form safe to commit to git.

installing sealed-secrets
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm install sealed-secrets sealed-secrets/sealed-secrets \
  -n kube-system

After installation the controller generates an RSA key pair inside the cluster. The user seals the secret with the public key, and the controller decrypts it with the private key.

sealing a secret
echo -n "postgres@prod" | kubectl create secret generic myshop-db \
  --dry-run=client --from-file=password=/dev/stdin -o yaml \
  | kubeseal --controller-namespace kube-system \
             --controller-name sealed-secrets \
             --format yaml \
  > sealedsecret.yaml
sealedsecret.yaml — committable to git
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: myshop-db
spec:
  encryptedData:
    password: AgB7K8x...  # only the cluster's controller can decrypt
  template:
    type: Opaque

Commit this manifest to git and sync it to the cluster with ArgoCD, and the controller automatically decrypts it into an ordinary Secret. This tool is the first of the three options listed in Chapter 20, GitOps §“The single-source model for secrets.”

The trade-offs of sealed-secrets #

  • Pros: no external dependency. One controller inside the cluster is all you need.
  • Cons: the controller’s private key lives only inside the cluster, so you have to back it up yourself; if the cluster and the key are lost, the committed SealedSecrets can no longer be decrypted. Since the keys of dev / staging / prod differ, SealedSecret manifests are not portable across environments.
  • Rotation: on secret rotation you have to re-seal and commit to git, which is a manual step.

In a small-team + single-cluster environment it’s the simplest option. When you need to share manifests across environments / automatic rotation, the next tool is suitable.

external-secrets — sync with an external secret store #

external-secrets (External Secrets Operator, ESO) is an instance of the Operator pattern from Chapter 18, CRD and Operator, and the tool covered in Chapter 23, DB Integration. It auto-syncs secrets from AWS Secrets Manager / HashiCorp Vault / GCP Secret Manager into K8s Secrets.

Comparing the core difference with sealed-secrets gives the following.

| Aspect | sealed-secrets | external-secrets |
| --- | --- | --- |
| the secret’s source of truth | git repo | external secret store (AWS SM, etc.) |
| the manifest’s content | the sealed value (ciphertext) | a reference to the secret (path / key) |
| rotation | manual (re-seal + git push) | automatic (renewed in the external store, ESO auto-syncs) |
| manifests across environments | differ per environment (different keys) | identical per environment (only the reference differs) |
| external dependency | none | the cost + availability of AWS SM / Vault |

For secrets that rotate frequently in production (DB passwords, API keys, etc.), external-secrets is the natural choice. The ESO manifests are the same as in Chapter 23: the two CRDs ClusterSecretStore + ExternalSecret.

ESO’s rotation — the automatic path #

ExternalSecret — automating rotation
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: myshop-api-db
spec:
  refreshInterval: 1h   # check the external store every hour
  # ...

refreshInterval is the key — ESO periodically polls the external store to detect changes and renew the K8s Secret. Combined with AWS Secrets Manager’s automatic rotation (Lambda-based), password rotation becomes fully automated.

automatic rotation flow
1. AWS Secrets Manager calls a Lambda (every 30 days, etc.)
2. the Lambda applies a new password to RDS + updates Secrets Manager
3. ESO detects the change within 1 hour, renews the K8s Secret
4. Reloader (a separate component) detects the Secret change, rollout-restarts the Pod
5. the myshop-api Pod connects to RDS with the new password

Once this cycle runs, periodic rotation (every 30 days in this example) is automated without a human ever touching the password. It’s one of the goals of operational secrets.
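The flow above implies a window during which a Pod may still hold the old password. A rough way to reason about it is to add up the stages, as in the sketch below. Only the 1-hour refreshInterval comes from the example manifest; the Reloader delay and rollout duration are assumed placeholder values, and the function name is hypothetical.

```python
from datetime import timedelta


def worst_case_propagation(
    refresh_interval: timedelta,
    reload_delay: timedelta,
    rollout_duration: timedelta,
) -> timedelta:
    """Upper bound on the time from rotation in the store to all Pods restarted."""
    return refresh_interval + reload_delay + rollout_duration


window = worst_case_propagation(
    refresh_interval=timedelta(hours=1),   # refreshInterval: 1h from the ExternalSecret
    reload_delay=timedelta(seconds=30),    # assumed time for Reloader to react
    rollout_duration=timedelta(minutes=5), # assumed time for the rollout to finish
)
print(window)  # prints: 1:05:30
```

If that window is longer than the application can tolerate, the lever within this flow is a shorter refreshInterval.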

SOPS — the simple option for a small team #

SOPS (Secrets OPerationS), originally developed at Mozilla and now a CNCF project, is a tool with a different shape from sealed-secrets / ESO. It encrypts a file locally, and you commit that encrypted file to git.

SOPS + age — the simplest setup
# generate an age key pair (once); age-keygen does not create the directory
mkdir -p ~/.config/sops/age
age-keygen -o ~/.config/sops/age/keys.txt

# write an ordinary secret YAML
cat > secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: myshop-db
stringData:
  password: postgres@prod
EOF

# encrypt in place; age-keygen -y prints the public key of the identity file
sops --encrypt --in-place \
     --age "$(age-keygen -y ~/.config/sops/age/keys.txt)" \
     secret.yaml

Without the key, the encrypted file reveals no plaintext values, so it’s safe to commit to git. At apply time, you decrypt it with SOPS back into an ordinary manifest and pass the result to kubectl apply (for example, sops --decrypt secret.yaml | kubectl apply -f -).

To combine with ArgoCD you need an auxiliary tool like helm-secrets or argocd-vault-plugin. Combining with AWS KMS lets you delegate key management to AWS, reducing the operational burden.

Where SOPS sits #

  • Pros — the simplest. One file = one secret bundle, so the mental model is intuitive.
  • Cons — no automatic rotation, and key management across environments takes effort. As secrets grow, file management becomes cumbersome.

It’s natural for a small team’s single environment + fewer than 10 secrets. As secrets grow or rotation automation becomes necessary, moving to ESO is the natural flow.

The decision tree for the three tools #

choosing a secret tool
- want to finish inside the cluster without using an external secret store
  -> sealed-secrets

- already have an external store like AWS / GCP / Vault
  + automatic rotation is an operational requirement
  -> external-secrets (ESO)

- small team + single environment + few secrets
  + like the simplicity of one file = one secret
  -> SOPS

- a combination of the above three (SOPS for infra secrets, ESO for app secrets, etc.)
  -> possible, but the operational burden increases

This book’s standard path is exactly as covered in Chapters 21 ~ 26: AWS Secrets Manager + External Secrets Operator. It’s the operational standard for an EKS environment, and the automatic RDS secret sync of Chapter 23 is the real application of this model.
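The decision tree above can be written down as a small function. This is only a sketch: the Situation fields and the function name are hypothetical, and because the branches of the tree are not mutually exclusive, the function returns every candidate that matches instead of inventing a priority order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    avoid_external_store: bool          # want everything to stay inside the cluster
    has_external_store: bool            # AWS / GCP / Vault store already in use
    needs_auto_rotation: bool           # automatic rotation is a requirement
    small_single_env_few_secrets: bool  # small team, single environment, few secrets
    likes_one_file_per_bundle: bool     # prefers "one file = one secret bundle"


def candidate_tools(s: Situation) -> list[str]:
    """Return the tools whose branch of the decision tree matches the situation."""
    candidates = []
    if s.avoid_external_store:
        candidates.append("sealed-secrets")
    if s.has_external_store and s.needs_auto_rotation:
        candidates.append("external-secrets")
    if s.small_single_env_few_secrets and s.likes_one_file_per_bundle:
        candidates.append("SOPS")
    return candidates


eks_team = Situation(
    avoid_external_store=False,
    has_external_store=True,
    needs_auto_rotation=True,
    small_single_env_few_secrets=False,
    likes_one_file_per_bundle=False,
)
print(candidate_tools(eks_team))  # prints: ['external-secrets']
```

More than one entry in the result corresponds to the last branch of the tree: a combination is possible, at the price of a higher operational burden.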

The “zero passwords” operation combined with IRSA #

The most advanced pattern is eliminating the password itself. The IRSA covered in Chapter 16, RBAC / ServiceAccount in Depth is the foundation of this pattern.

Combining two pieces #

A production workload’s external credentials split roughly into two kinds.

the two combinations of zero passwords
[AWS API calls — S3, Secrets Manager, CloudWatch]
   -> IRSA + projected token + STS AssumeRoleWithWebIdentity
   -> no static key, the short-lived credentials are refreshed automatically

[DB connection — RDS PostgreSQL / MySQL]
   -> RDS IAM auth + a 15-minute IAM token
   -> no DB password, authentication by IAM permission

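Combining the two pieces creates an operational model where there is no permanent secret anywhere in myshop-api. You don’t need to worry about password rotation, and every AWS API call is recorded in CloudTrail. Database logins made through IAM auth are not logged by CloudTrail, so trace those in the database’s own logs.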

Applying RDS IAM auth #

Python — connect to RDS with an IRSA token
import os
import boto3
import psycopg2

def get_db_connection():
    rds_client = boto3.client("rds")
    token = rds_client.generate_db_auth_token(
        DBHostname=os.environ["DB_HOST"],
        Port=5432,
        DBUsername=os.environ["DB_USER"],
        Region="ap-northeast-2",
    )

    return psycopg2.connect(
        host=os.environ["DB_HOST"],
        port=5432,
        user=os.environ["DB_USER"],
        password=token,           # not a password but an IAM token
        dbname=os.environ["DB_NAME"],
        sslmode="require",
    )

Here boto3.client("rds") is automatically authenticated with IRSA’s projected token, and generate_db_auth_token makes a 15-minute token. Neither a K8s Secret nor an AWS Secrets Manager secret is needed.
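The automatic authentication relies on two environment variables that the EKS pod identity webhook injects into the Pod: AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. When the connection code fails with a credentials error, a quick check like the following sketch (the function name is hypothetical) shows whether the Pod received the IRSA settings at all.

```python
import os
from typing import Mapping

# Environment variables injected into the Pod when its ServiceAccount is
# annotated with an IAM role for IRSA.
IRSA_ENV_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the IRSA variables that are absent or empty in the environment."""
    return [name for name in IRSA_ENV_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_irsa_settings({"AWS_ROLE_ARN": "arn:aws:iam::111122223333:role/example"}))
# prints: ['AWS_WEB_IDENTITY_TOKEN_FILE']
```

An empty list means the SDK has what it needs to exchange the projected token for temporary credentials. It does not prove that the role's trust policy or permissions are correct.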

The limits of “zero passwords” #

  • Token expiry: a token is valid for 15 minutes and is checked only when a connection is opened. An established connection keeps working, but every new connection (pool growth, reconnect) needs a fresh token. If you put a pooler in between, the pooler itself has to receive the token.
  • PostgreSQL user setup needed: you have to grant the rds_iam role to the database user and set up its grants.
  • Not every DB supports it: Aurora MySQL / PostgreSQL support it; some old RDS engines don’t.
  • Hard to use together with PgBouncer transaction pooling: the issue described in Chapter 23 §“The trap of transaction pooling.”

Because of these limits, this book’s standard path is the combination of traditional passwords + Secrets Manager + ESO + IRSA. “Zero passwords” is an option to apply to some workloads in an environment with stricter security. You have to evaluate the balance of operational burden and security strength as a whole.
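For workloads that do adopt IAM auth, the token-expiry limit is usually handled by regenerating the token shortly before it expires. The sketch below shows one way to do that. The class name CachedToken and the 60-second safety margin are assumptions; the 900-second lifetime mirrors the 15-minute token described above. The token generator is passed in, so the class itself does not depend on boto3.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Caches a short-lived token and regenerates it before it expires."""

    def __init__(
        self,
        generate: Callable[[], str],
        ttl_seconds: float = 900.0,    # 15-minute RDS IAM auth token
        margin_seconds: float = 60.0,  # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._generate = generate
        self._lifetime = ttl_seconds - margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Generate a new token and remember when it should be replaced.
            self._token = self._generate()
            self._expires_at = now + self._lifetime
        return self._token
```

In the connection code from this chapter, generate would be a small function that wraps rds_client.generate_db_auth_token(...), and get() would be called each time a new connection is opened.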

Combining with RBAC — separation per namespace #

Not only the operation of the secret itself but also who can read that secret is at the core of secret security. The RBAC model of Chapter 14, RBAC / NetworkPolicy / ResourceQuota is the key of this section.

permission to read secrets per namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: myshop
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]

If you grant this Role only to a ServiceAccount in the myshop namespace, workloads in other namespaces cannot read myshop’s secrets. It’s the basic setup for per-team isolation inside one cluster.
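The namespace scoping can be illustrated with a deliberately simplified rule matcher. This is a teaching sketch, not how the API server evaluates RBAC: it ignores RoleBindings, ClusterRoles, and resourceNames, and the function name role_allows is hypothetical.

```python
def role_allows(role: dict, namespace: str, verb: str, resource: str, api_group: str = "") -> bool:
    """Return True if a namespaced Role's rules cover the verb and resource."""
    if role.get("kind") != "Role":
        return False
    # A Role only ever applies inside its own namespace.
    if role.get("metadata", {}).get("namespace") != namespace:
        return False
    for rule in role.get("rules", []):
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = rule.get("verbs", [])
        if (
            (api_group in groups or "*" in groups)
            and (resource in resources or "*" in resources)
            and (verb in verbs or "*" in verbs)
        ):
            return True
    return False


# The Role from the manifest above, written as a dictionary.
secret_reader = {
    "kind": "Role",
    "metadata": {"name": "secret-reader", "namespace": "myshop"},
    "rules": [{"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list", "watch"]}],
}
print(role_allows(secret_reader, "myshop", "get", "secrets"))  # prints: True
print(role_allows(secret_reader, "other", "get", "secrets"))   # prints: False
```

Against a real cluster, kubectl auth can-i answers the same question with the full RBAC evaluation.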

Disabling the ServiceAccount token #

The pattern of Chapter 14 §“Unmounting the automatic mount of the ServiceAccount token” is the last safety line of the security model.

unmount the token automount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myshop-api
  namespace: myshop
automountServiceAccountToken: false

With this one line in place, even if a Pod is compromised, there’s no token to access the K8s API directly. Security guides regularly recommend letting only workloads that actually call the K8s API receive a token and turning it off for everything else. IRSA is not affected, because its web identity token is mounted through a separate projected volume.
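How widely this setting is applied can be measured from the ServiceAccount manifests themselves. The following sketch assumes the manifests have already been loaded into dictionaries (for example from kubectl get serviceaccounts -o json); the function name is hypothetical.

```python
def automount_disabled_rate(service_accounts: list[dict]) -> float:
    """Return the fraction of ServiceAccounts with automountServiceAccountToken: false."""
    if not service_accounts:
        return 0.0
    disabled = sum(
        1 for sa in service_accounts
        if sa.get("automountServiceAccountToken") is False
    )
    return disabled / len(service_accounts)


accounts = [
    {"metadata": {"name": "myshop-api"}, "automountServiceAccountToken": False},
    {"metadata": {"name": "default"}},  # field absent: the token is mounted
]
print(automount_disabled_rate(accounts))  # prints: 0.5
```

A Pod spec can override the ServiceAccount setting, so a complete check would also look at the automountServiceAccountToken field of each Pod template.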

Key separation across dev / staging / prod #

For sealed-secrets, each environment’s controller should have its own key pair, and for ESO, the trust policy of each environment’s IRSA Role must be separate. If the dev controller can decrypt prod’s secrets, it’s a big security hole.

The per-environment separation of Chapter 20, GitOps combines naturally with this section’s key separation. Environment separation must be consistently established not only at the manifest level but also at the secret-key level.

Audit — who accessed what, and when #

Audit is essential for the post-incident analysis of a secret incident. There are three categories of tools.

K8s Audit log #

terraform — enabling EKS audit log
module "eks" {
  # ...
  cluster_enabled_log_types = ["api", "audit", "authenticator"]
}

When this setting is on, every API request is recorded in the audit log group of CloudWatch Logs. “Which ServiceAccount read the myshop-api-db Secret, and when” becomes traceable.

audit log query — CloudWatch Insights
fields @timestamp, user.username, verb, objectRef.resource, objectRef.name
| filter objectRef.resource = "secrets"
| filter verb in ["get", "list"]
| sort @timestamp desc
| limit 100

The standard is to run this query once a quarter to check whether there are any abnormal access patterns. It’s good to add to the quarterly security checkup items of Chapter 26, The Operations Checklist.
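Once the query results are exported, the check for abnormal access can be partly automated. The sketch below assumes each result row has been reduced to a dictionary with username, verb, and secret keys (hypothetical names chosen here), and that you maintain an allowlist of identities expected to read secrets. ServiceAccounts appear in the audit log as system:serviceaccount:<namespace>:<name>.

```python
from collections import Counter


def unexpected_secret_readers(events: list[dict], allowed_users: set[str]) -> dict[str, int]:
    """Count get/list calls on secrets made by identities outside the allowlist."""
    counts = Counter(
        event["username"]
        for event in events
        if event["verb"] in ("get", "list") and event["username"] not in allowed_users
    )
    return dict(counts)


events = [
    {"username": "system:serviceaccount:myshop:myshop-api", "verb": "get", "secret": "myshop-api-db"},
    {"username": "system:serviceaccount:other:batch", "verb": "list", "secret": "myshop-api-db"},
    {"username": "system:serviceaccount:other:batch", "verb": "get", "secret": "myshop-api-db"},
]
allowed = {"system:serviceaccount:myshop:myshop-api"}
print(unexpected_secret_readers(events, allowed))
# prints: {'system:serviceaccount:other:batch': 2}
```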

AWS CloudTrail — Secrets Manager access #

Secrets Manager access history
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=GetSecretValue \
  --max-results 50

Every call to AWS Secrets Manager is recorded in CloudTrail. You can see who, with which IAM Role, accessed which secret. It’s a tool to verify whether the ESO IRSA Role of Chapter 23 is used consistently.
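The same lookup can be scripted to group reads by caller and secret. The sketch below takes a CloudTrail client as a parameter (in practice boto3.client("cloudtrail")) and calls lookup_events with the same attribute as the CLI command above. The function name is hypothetical, and the fields read from the event payload (userIdentity.arn, requestParameters.secretId) are assumptions about the GetSecretValue event format that you should confirm against your own events.

```python
import json
from collections import Counter


def count_secret_reads(cloudtrail_client, max_results: int = 50) -> Counter:
    """Count GetSecretValue events per (caller ARN, secret ID) pair."""
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "GetSecretValue"}
        ],
        MaxResults=max_results,
    )
    reads = Counter()
    for event in response.get("Events", []):
        # CloudTrailEvent is a JSON string holding the full event record.
        detail = json.loads(event["CloudTrailEvent"])
        caller = detail.get("userIdentity", {}).get("arn", "unknown")
        secret_id = (detail.get("requestParameters") or {}).get("secretId", "unknown")
        reads[(caller, secret_id)] += 1
    return reads
```

If only the ESO IRSA Role shows up as a caller, the role is being used consistently. Any other ARN in the result is worth a closer look.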

GuardDuty / Kubescape — anomaly detection #

When GuardDuty’s EKS Protection is on, it automatically detects abnormal secret access patterns (e.g., a newly created ServiceAccount suddenly reading many secrets). Kubescape catches manifest-stage security policy violations (committing a Secret in plaintext, failing to disable the token automount, etc.) at the CI stage.

The secret governance checklist #

We organize a one-page checklist for the quarterly checkup.

quarterly secret governance checkup
[Storage]
- whether EKS encryption-at-rest (KMS) is enabled
- whether there is no plaintext secret in the manifests of the git repo — gitleaks / trufflehog scan
- consistency of the chosen tool among sealed-secrets / ESO / SOPS

[Rotation]
- the list of secrets not rotated for 90+ days
- whether AWS Secrets Manager automatic rotation is enabled (RDS, API keys, etc.)
- whether the rotation-failure alert works

[Injection]
- whether the choice of envFrom vs volumeMount matches the rotation frequency
- integration with Reloader — automating the Pod restart on Secret renewal
- the list of workloads operable at zero passwords with IRSA + RDS IAM auth

[Audit]
- EKS audit log enabled + quarterly CloudWatch Insights checkup
- the Secrets Manager GetSecretValue checkup in CloudTrail
- the alert-handling state of GuardDuty / Kubescape
- the application rate of automountServiceAccountToken: false

The goal of secret operations is for this checklist to fit on one page and be filled in regularly each quarter. The regular operations calendar of Chapter 26 and this chapter’s checklist together support the security posture of a production cluster.
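The first item of the rotation section, the list of secrets not rotated for 90+ days, is easy to script. The sketch below works on a list of dictionaries shaped like the entries that the Secrets Manager ListSecrets API returns (Name, LastRotatedDate, LastChangedDate). The function name is hypothetical, and falling back to LastChangedDate when a secret has never been rotated is a design choice made here, not a rule from the checklist.

```python
from datetime import datetime, timedelta, timezone


def stale_secrets(secrets: list[dict], now: datetime, max_age: timedelta = timedelta(days=90)) -> list[str]:
    """Return the names of secrets whose last rotation (or change) is older than max_age."""
    stale = []
    for secret in secrets:
        last = secret.get("LastRotatedDate") or secret.get("LastChangedDate")
        if last is None or now - last > max_age:
            stale.append(secret["Name"])
    return sorted(stale)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
inventory = [
    {"Name": "myshop/db", "LastRotatedDate": now - timedelta(days=10)},
    {"Name": "myshop/api-key", "LastChangedDate": now - timedelta(days=200)},
    {"Name": "myshop/never-touched"},
]
print(stale_secrets(inventory, now))
# prints: ['myshop/api-key', 'myshop/never-touched']
```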

Exercises #

  1. Install both sealed-secrets and external-secrets on your dev cluster, and operate the same secret (e.g., a dummy DB password) with each tool separately. Follow a secret-rotation scenario (changing the value) along both paths and compare how many steps sealed-secrets needs versus how many steps ESO needs. Organize the shape of the manifest’s git diff, the change in the ArgoCD UI, and whether a Pod restart occurs into one table.
  2. Pick one workload of myshop-api and move it to “zero passwords.” Add one database user of RDS to the rds_iam group, and apply code that connects with an IAM token like this chapter’s Python example. In one paragraph, organize the decision rationale, tailored to your scenario, for how to solve the token-expiry problem that arises when combining with PgBouncer — whether to bypass the pooler, or to have the pooler itself receive the token.
  3. Enable the EKS audit log and run this chapter’s secret-access query in CloudWatch Insights. Classify the results over a week into normal / abnormal patterns, and look for items suspected to be abnormal (an unexpected ServiceAccount accessing myshop’s Secret, a large volume of GetSecretValue at dawn, etc.). Write one manifest that auto-detects the discovered pattern with a PrometheusRule of Chapter 25, Monitoring · Alerts or a GuardDuty rule.

In one line: A K8s Secret’s base64 is not encryption, and etcd encryption-at-rest protects only the data inside the cluster; the manifest stage is a separate problem. Secret operations have four axes, storage · rotation · injection · audit, and the tools split into three groups: sealed-secrets (self-contained in git and the cluster) / external-secrets (external-store sync + automatic rotation) / SOPS (the simple choice for a small team). envFrom is simple but needs a Pod restart on rotation, while volumeMount auto-renews the file. The “zero passwords” model of IRSA + RDS IAM auth is the most advanced, but token expiry and the difficulty of combining it with PgBouncer mean it fits only some workloads. RBAC + per-environment key separation + automountServiceAccountToken: false are the last security safety line, and the EKS audit log + CloudTrail + GuardDuty are the tools of the audit layer. The goal of operations is for the quarterly secret governance checklist to fit on one page.

Next chapter #

If this chapter dealt with the shape of secrets, the next chapter is the shape of time. K8s ships a minor version roughly every four months (about three a year), and EKS standard support lasts 14 months. At least one minor upgrade a year is an essential operational cycle, and the manual for running that cycle safely is the body of the next chapter.

Chapter 30, Upgrade Strategy covers the EKS upgrade flow briefly mentioned in Chapter 26, The Operations Checklist. It covers the order control plane → data plane → add-ons, deprecated API detection (pluto · kubent · the apiserver_requested_deprecated_apis metric), the safety devices of node drain (PDB · terminationGracePeriodSeconds), minimizing the blast radius, rollback scenarios, and the checklist for the week before / the day of / the week after the upgrade.
