Secret Operations
The third chapter of Part 5. Starting from the fact that a K8s Secret is only base64-encoded and from what etcd encryption-at-rest does and does not protect, it covers the secret lifecycle along four axes: storage, rotation, injection, and audit. It turns the comparison of sealed-secrets, external-secrets, and SOPS, zero-password operation (IRSA for the AWS API, RDS IAM auth for the DB), the rotation difference between envFrom and volume mounts, per-namespace separation with RBAC, and auditing with the audit log and GuardDuty into a practical operations manual.
This is the third chapter of Part 5 (Operations · Debugging · Cost). Secrets have appeared piecemeal across several chapters of this book: Chapter 6, ConfigMap · Secret; Chapter 14, RBAC / NetworkPolicy / ResourceQuota; Chapter 16, RBAC / ServiceAccount in Depth; Chapter 18, CRD and Operator; Chapter 20, GitOps; and Chapter 23, DB Integration. This chapter gathers those fragments into one operations manual. It picks up after the basic rule “don’t commit secret YAML to git unchanged” and covers the full production secret lifecycle.
By the end of this chapter, the four axes of storage, rotation, injection, and audit should fit together as one operational model. We compare the three tools sealed-secrets, external-secrets, and SOPS, cover the “zero passwords” operation combined with IRSA, and establish a baseline for secret governance.
The limit of a K8s Secret: base64 is not encryption
The point made in Chapter 6, ConfigMap · Secret §“The essential limit of a Secret” is this chapter’s starting point.
apiVersion: v1
kind: Secret
metadata:
  name: myshop-api
type: Opaque
data:
  DATABASE_PASSWORD: cG9zdGdyZXNAcHJvZA==  # postgres@prod

The value of data is merely base64-encoded, not encrypted. Anyone can recover the original with a single base64 -d. If this manifest is committed to git, the secret is exposed as-is.
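To make the point concrete, here is a minimal Python sketch that decodes the value from the manifest above using only the standard library. The variable names are illustrative.

```python
import base64

# Value copied from the `data` field of the manifest above.
encoded = "cG9zdGdyZXNAcHJvZA=="

# base64 is a reversible encoding: no key is involved in recovering the original.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # prints: postgres@prod
```

Nothing here requires cluster access or credentials, which is why a committed Secret manifest should be treated as a leaked secret.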
The meaning and limit of etcd encryption-at-rest
The K8s API Server stores objects in etcd. By default EKS encrypts etcd’s disks with KMS, but the objects inside etcd are still stored as plaintext. That means anyone who gains access to the etcd data itself can read a Secret’s value.
What prevents this is encryption-at-rest.
module "eks" {
  # ...
  encryption_config = [{
    provider_key_arn = aws_kms_key.eks.arn
    resources        = ["secrets"]
  }]
}

When this setting is on, the Secret object itself is encrypted with a KMS key before being stored in etcd. It is a baseline setting for a production cluster. However, even with this on, the git-commit problem of the manifest isn’t solved: this is protection inside etcd, not protection at the manifest stage.
Keeping these two separate is the first mental model of secret operations.
| Location | Protection tool |
|---|---|
| the manifest inside the git repo | sealed-secrets / external-secrets / SOPS |
| etcd inside the cluster | encryption-at-rest (KMS) |
| the environment variables / files inside the Pod | Pod isolation, RBAC, audit |
The four axes of secret operations
The production secret lifecycle breaks down into four axes.
| Axis | Question |
|---|---|
| Storage | where is the real secret value? |
| Rotation | how is the password renewed? |
| Injection | how does the Pod receive that value? |
| Audit | who accessed that value, and when? |
Most secret incidents are not a defect of one axis but the combined result of leaks across several axes. Introducing only one tool resolves one or two axes, but if another axis is weak, security breaks in the end. Treating the four axes as one system is the starting point of operations.
The injection pattern: envFrom vs mount
There are two standard patterns for injecting a secret into a Pod.
# Pattern 1: envFrom (Secret keys become environment variables)
spec:
  containers:
    - name: api
      envFrom:
        - secretRef:
            name: myshop-api-db

# Pattern 2: volume mount (Secret keys become files)
spec:
  containers:
    - name: api
      volumeMounts:
        - name: db-secret
          mountPath: /var/secrets/db
          readOnly: true
  volumes:
    - name: db-secret
      secret:
        secretName: myshop-api-db

The decisive difference between the two patterns is behavior on rotation.
| Aspect | envFrom | volumeMount |
|---|---|---|
| on Secret renewal | environment variables keep the old value (they are set only at Pod start) | the file is updated automatically, typically within about a minute (subPath mounts are not updated) |
| rotation handling | needs a Pod restart | the application just needs to re-read the file |
| debugging | check instantly with the env command | check the file path + cat |
The trap of “needing kubectl rollout restart on Secret renewal” noted in Chapter 23, DB Integration is specific to envFrom. For secrets that rotate frequently, volumeMount is operationally more natural. If the application code re-reads the file periodically, the new secret takes effect without a Pod restart.
That said, the environment-variable model is simpler, and most 12-factor apps expect environment variables. The standard is to choose between the two based on rotation frequency and the application’s needs.
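The "re-read the file periodically" approach for mounted secrets can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the class name `FileSecret`, the helper `write_secret`, the TTL value, and the temporary file standing in for `/var/secrets/db/password` are all assumptions made for the example.

```python
import os
import tempfile
import time


class FileSecret:
    """Returns the content of a mounted secret file, re-reading it after a TTL."""

    def __init__(self, path, ttl_seconds=30.0, clock=time.monotonic):
        self._path = path
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            with open(self._path, encoding="utf-8") as f:
                self._value = f.read()
            self._loaded_at = now
        return self._value


def write_secret(path, value):
    """Stand-in for the kubelet updating the mounted file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(value)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "password")
        write_secret(path, "old-password")
        secret = FileSecret(path, ttl_seconds=0.0)  # TTL 0: re-read on every call
        print(secret.get())
        write_secret(path, "new-password")
        print(secret.get())
```

The application would call `get()` each time it opens a new connection, so a rotated value is picked up within one TTL without restarting the Pod.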
sealed-secrets: commit safely to git
Bitnami’s sealed-secrets is a tool that seals a secret into a form safe to commit to git.
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm install sealed-secrets sealed-secrets/sealed-secrets \
  -n kube-system

After installation the controller generates an RSA key pair inside the cluster. The user seals the secret with the public key, and the controller decrypts it with the private key.
echo -n "postgres@prod" | kubectl create secret generic myshop-db \
  --dry-run=client --from-file=password=/dev/stdin -o yaml \
  | kubeseal --controller-namespace kube-system \
      --controller-name sealed-secrets \
      --format yaml \
  > sealedsecret.yaml

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: myshop-db
spec:
  encryptedData:
    password: AgB7K8x...  # only the cluster's controller can decrypt
  template:
    type: Opaque

Commit this manifest to git and sync it to the cluster with ArgoCD, and the controller automatically decrypts it into an ordinary Secret. This tool is the first of the three options noted in Chapter 20, GitOps §“The single-source model for secrets.”
The trade-offs of sealed-secrets
- Pros: no external dependency. A single controller inside the cluster is all it takes.
- Cons: the controller’s private key lives inside the cluster, so it has to be backed up along with the cluster; if the key is lost, existing SealedSecrets can no longer be decrypted. Since the keys of dev / staging / prod differ, SealedSecret manifests are not portable across environments.
- Rotation: on secret rotation you have to re-seal and commit to git, which is a manual step.
In a small-team, single-cluster environment it’s the simplest option. When you need to share manifests across environments or rotate automatically, the next tool is a better fit.
external-secrets: sync with an external secret store
external-secrets (the External Secrets Operator, ESO) follows the Operator pattern of Chapter 18, CRD and Operator, and is the tool covered in Chapter 23, DB Integration. It auto-syncs secrets from AWS Secrets Manager / HashiCorp Vault / GCP Secret Manager into K8s Secrets.
Comparing the core difference with sealed-secrets gives the following.
| Aspect | sealed-secrets | external-secrets |
|---|---|---|
| the secret’s source of truth | git repo | external secret store (AWS SM, etc.) |
| the manifest’s content | the sealed value (ciphertext) | a reference to the secret (path / key) |
| rotation | manual (re-seal + git push) | automatic (renewed in the external store, ESO auto-syncs) |
| manifests across environments | differ per environment (different keys) | identical per environment (only the reference differs) |
| external dependency | none | the cost + availability of AWS SM / Vault |
For secrets that rotate frequently in production (DB passwords, API keys, etc.), external-secrets is the natural choice. The ESO manifest itself is the same as in Chapter 23: the two CRDs ClusterSecretStore + ExternalSecret.
ESO’s rotation: the automatic path
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: myshop-api-db
spec:
  refreshInterval: 1h  # check the external store every hour
  # ...

refreshInterval is the key: ESO periodically polls the external store to detect changes and renew the K8s Secret. Combined with AWS Secrets Manager’s automatic rotation (Lambda-based), password rotation becomes fully automated.

1. AWS Secrets Manager calls a Lambda (every 30 days, etc.)
2. the Lambda applies a new password to RDS + updates Secrets Manager
3. ESO detects the change within 1 hour, renews the K8s Secret
4. Reloader (a separate component) detects the Secret change, rollout-restarts the Pod
5. the myshop-api Pod connects to RDS with the new password

Once this cycle runs, scheduled rotation is automated without a human ever touching the password. This is one of the goals of secret operations.
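The poll, compare, and renew behavior in steps 3 and 4 can be modeled with a toy sketch. This is not ESO's or Reloader's implementation: plain dictionaries stand in for the external store and the cluster, and `sync_once` and the `on_change` callback are hypothetical names chosen for the example.

```python
from typing import Callable, Dict


def sync_once(
    external_store: Dict[str, str],
    cluster_secrets: Dict[str, str],
    key: str,
    on_change: Callable[[str], None],
) -> bool:
    """One polling pass: copy `key` into the cluster copy if the value changed."""
    remote_value = external_store[key]
    if cluster_secrets.get(key) == remote_value:
        return False  # nothing to do until the next refresh interval
    cluster_secrets[key] = remote_value
    on_change(key)  # stand-in for Reloader triggering a rollout restart
    return True


store = {"myshop-api-db": "old-password"}
cluster: Dict[str, str] = {}
restarts = []

sync_once(store, cluster, "myshop-api-db", restarts.append)  # initial sync
store["myshop-api-db"] = "new-password"  # steps 1-2: rotation in the store
changed = sync_once(store, cluster, "myshop-api-db", restarts.append)  # step 3
print(changed, cluster["myshop-api-db"], len(restarts))
```

In this model the cluster copy lags the store by at most one polling interval, which is the role refreshInterval plays in the manifest above.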
SOPS: the simple option for a small team
SOPS (Secrets OPerationS), originally developed at Mozilla and now a CNCF project, has a different shape from sealed-secrets / ESO. It encrypts a file locally, and you commit that encrypted file to git.
# generate an age key pair (once)
mkdir -p ~/.config/sops/age
age-keygen -o ~/.config/sops/age/keys.txt

# write an ordinary secret YAML
cat > secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: myshop-db
stringData:
  password: postgres@prod
EOF

# encrypt in place for the public key derived from the key file
sops --age "$(age-keygen -y ~/.config/sops/age/keys.txt)" \
  --encrypt --in-place secret.yaml

In the encrypted file the values are ciphertext (the YAML keys stay readable), so nothing sensitive is visible without the age key. It’s safe to commit to git. At apply time, you decrypt it back into an ordinary manifest with SOPS and pass the result to kubectl apply.
To combine with ArgoCD you need an auxiliary tool like helm-secrets or argocd-vault-plugin. Combining with AWS KMS lets you delegate key management to AWS, reducing the operational burden.
Where SOPS sits
- Pros: the simplest. One file = one secret bundle, so the mental model is intuitive.
- Cons: no automatic rotation, and key management across environments takes effort. As secrets grow, file management becomes cumbersome.
It’s a natural fit for a small team with a single environment and fewer than 10 secrets. As secrets grow or rotation automation becomes necessary, moving to ESO is the natural flow.
The decision tree for the three tools
- want to finish inside the cluster without using an external secret store
    -> sealed-secrets
- already have an external store like AWS / GCP / Vault
  + automatic rotation is an operational requirement
    -> external-secrets (ESO)
- small team + single environment + few secrets
  + like the simplicity of one file = one secret
    -> SOPS
- a combination of the above three (SOPS for infra secrets, ESO for app secrets, etc.)
    -> possible, but the operational burden increases

This book’s standard path is exactly as covered in Chapters 21 ~ 26: AWS Secrets Manager + External Secrets Operator. It’s the operational standard for an EKS environment, and the automatic RDS secret sync of Chapter 23 is the real application of this model.
The “zero passwords” operation combined with IRSA
The most advanced pattern is eliminating the password itself. The IRSA covered in Chapter 16, RBAC / ServiceAccount in Depth is the foundation of this pattern.
Combining two kinds of credentials
A production workload’s external credentials roughly split into two.
[AWS API calls: S3, Secrets Manager, CloudWatch]
  -> IRSA + projected token + STS AssumeRoleWithWebIdentity
  -> no static key; the short-lived credentials are refreshed automatically (STS credentials last one hour by default)
[DB connection: RDS PostgreSQL / MySQL]
  -> RDS IAM auth + a 15-minute IAM token
  -> no DB password, authentication by IAM permission

Combining the two pieces creates an operational model where there is no permanent secret anywhere in myshop-api. You don’t need to worry about password rotation, and the AWS API calls are recorded in CloudTrail.
Applying RDS IAM auth
import os

import boto3
import psycopg2

REGION = "ap-northeast-2"


def get_db_connection():
    # Credentials come from the IRSA projected token; no static key is configured.
    rds_client = boto3.client("rds", region_name=REGION)
    # Generate a short-lived auth token to use in place of a password.
    token = rds_client.generate_db_auth_token(
        DBHostname=os.environ["DB_HOST"],
        Port=5432,
        DBUsername=os.environ["DB_USER"],
        Region=REGION,
    )
    return psycopg2.connect(
        host=os.environ["DB_HOST"],
        port=5432,
        user=os.environ["DB_USER"],
        password=token,  # not a password but an IAM token
        dbname=os.environ["DB_NAME"],
        sslmode="require",  # IAM auth requires an SSL connection
    )

Here boto3.client("rds") is automatically authenticated with IRSA’s projected token, and generate_db_auth_token makes a 15-minute token. Neither a K8s Secret nor an AWS Secrets Manager secret is needed.
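Because the token is short-lived, code that opens connections over time has to keep track of its age. The sketch below shows one way to do that bookkeeping, under stated assumptions: `CachedToken`, its parameters, and the one-minute safety margin are hypothetical, and the `generate` callable stands in for a wrapper around `generate_db_auth_token`. Since boto3 signs the token locally, caching is optional; the point is the expiry handling a pool or pooler integration would need.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Caches a short-lived token and regenerates it shortly before it expires."""

    def __init__(
        self,
        generate: Callable[[], str],
        lifetime_seconds: float = 15 * 60,  # token lifetime stated in the text
        refresh_margin_seconds: float = 60,  # assumption: renew one minute early
        clock: Callable[[], float] = time.monotonic,
    ):
        self._generate = generate
        self._lifetime = lifetime_seconds
        self._margin = refresh_margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def get(self) -> str:
        now = self._clock()
        age = now - self._issued_at
        if self._token is None or age >= self._lifetime - self._margin:
            self._token = self._generate()
            self._issued_at = now
        return self._token
```

A connection factory would call `get()` every time it opens a new connection and pass the result as the password, so a reconnect after the 15-minute window never reuses an expired token.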
The limits of “zero passwords”
- Token expiry: the token expires after 15 minutes, so anything that opens connections later (a pool that reconnects, a long-running worker) needs a way to obtain a fresh token. If you put a pooler in between, the pooler itself has to receive the token.
- PostgreSQL user setup needed: you have to grant the rds_iam role to the database user and set up grants.
- Not every DB supports it: Aurora MySQL / PostgreSQL support it; some old RDS engines don’t.
- Hard to use together with PgBouncer transaction pooling: the issue noted in Chapter 23 §“The trap of transaction pooling.”
Because of these limits, this book’s standard path is the combination of traditional passwords + Secrets Manager + ESO + IRSA. “Zero passwords” is an option to apply to some workloads in an environment with stricter security. You have to evaluate the balance of operational burden and security strength as a whole.
Combining with RBAC: separation per namespace
Secret security depends not only on how the secret itself is operated but also on who can read it. The RBAC model of Chapter 14, RBAC / NetworkPolicy / ResourceQuota is the basis of this section.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: myshop
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]

If you bind this Role only to a ServiceAccount in the myshop namespace, and no other Role or ClusterRole grants access to Secrets, workloads in other namespaces cannot read myshop’s secrets. It’s the basic setup for per-team isolation inside one cluster.
Disabling the ServiceAccount token
The pattern of Chapter 14 §“Unmounting the automatic mount of the ServiceAccount token” is the last line of defense in the security model.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myshop-api
  namespace: myshop
automountServiceAccountToken: false

With this one line in place, even if a Pod is compromised, there’s no mounted token to call the K8s API with. Letting only workloads that actually call the K8s API receive a token, and turning it off for everything else, is a regular recommendation in security guides. IRSA is not affected, because the EKS webhook mounts its own projected token as a separate volume.
Key separation across dev / staging / prod
For sealed-secrets, each environment’s controller must have its own key pair, and for ESO, each environment’s IRSA Role must have its own trust policy. If the dev controller can decrypt prod’s secrets, that is a serious security hole.
The per-environment separation of Chapter 20, GitOps combines naturally with this section’s key separation. Environment separation must be consistently established not only at the manifest level but also at the secret-key level.
Audit: who accessed what, and when
Audit is essential for the post-incident analysis of a secret incident. There are three categories of tools.
K8s Audit log
module "eks" {
  # ...
  cluster_enabled_log_types = ["api", "audit", "authenticator"]
}

When this setting is on, every API request is recorded in the audit log group of CloudWatch Logs. “Which ServiceAccount read the myshop-api-db Secret, and when” becomes traceable.
fields @timestamp, user.username, verb, objectRef.resource, objectRef.name
| filter objectRef.resource = "secrets"
| filter verb in ["get", "list"]
| sort @timestamp desc
| limit 100

The standard is to run this query once a quarter to check whether there are any abnormal access patterns. It’s worth adding to the quarterly security checkup items of Chapter 26, The Operations Checklist.
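The same filter can be applied offline to exported audit events. The following is a minimal sketch, assuming the events have already been parsed into dictionaries with the audit fields used in the query above (`verb`, `user.username`, `objectRef.resource`); the function name, the allowlist, and the sample events are made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def unexpected_secret_reads(
    events: Iterable[Dict], allowed_users: Set[str]
) -> List[Dict]:
    """Return audit events where a user outside the allowlist read a Secret."""
    hits = []
    for event in events:
        ref = event.get("objectRef", {})
        if ref.get("resource") != "secrets":
            continue
        if event.get("verb") not in {"get", "list"}:
            continue
        username = event.get("user", {}).get("username", "")
        if username not in allowed_users:
            hits.append(event)
    return hits


# Made-up sample events shaped like K8s audit log entries.
sample_events = [
    {"verb": "get",
     "user": {"username": "system:serviceaccount:myshop:myshop-api"},
     "objectRef": {"resource": "secrets", "namespace": "myshop", "name": "myshop-api-db"}},
    {"verb": "list",
     "user": {"username": "system:serviceaccount:default:debug"},
     "objectRef": {"resource": "secrets", "namespace": "myshop"}},
    {"verb": "get",
     "user": {"username": "system:serviceaccount:default:debug"},
     "objectRef": {"resource": "pods", "namespace": "myshop", "name": "api-0"}},
]
allowed = {"system:serviceaccount:myshop:myshop-api"}

hits = unexpected_secret_reads(sample_events, allowed)
for event in hits:
    print(event["user"]["username"], event["verb"])
```

Keeping the allowlist in version control turns the quarterly review into a diff: any username that appears in the output and not in the list needs an explanation.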
AWS CloudTrail: Secrets Manager access
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=GetSecretValue \
  --max-results 50

Every call to AWS Secrets Manager is recorded in CloudTrail. You can see who, with which IAM Role, accessed which secret. It’s a tool to verify whether the ESO IRSA Role of Chapter 23 is used consistently.
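To check at a glance who is calling GetSecretValue, the JSON output of the command above can be summarized per caller. This is a small sketch that assumes the output has the usual top-level `Events` list with a `Username` field on each event; the function name and the sample usernames are invented for the example.

```python
import json
from collections import Counter


def get_secret_value_callers(lookup_output: str) -> Counter:
    """Count events per Username in the JSON printed by lookup-events."""
    events = json.loads(lookup_output).get("Events", [])
    return Counter(event.get("Username", "<unknown>") for event in events)


# Made-up sample standing in for the real command output.
sample_output = json.dumps({
    "Events": [
        {"EventName": "GetSecretValue", "Username": "external-secrets"},
        {"EventName": "GetSecretValue", "Username": "external-secrets"},
        {"EventName": "GetSecretValue", "Username": "alice"},
    ]
})

counts = get_secret_value_callers(sample_output)
for username, count in counts.most_common():
    print(username, count)
```

If ESO is the only intended reader, any other name in this summary is a lead worth following up.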
GuardDuty / Kubescape: anomaly detection
When GuardDuty’s EKS Protection is on, it automatically detects abnormal secret access patterns (e.g., a newly created ServiceAccount suddenly reading many secrets). Kubescape catches manifest-stage security policy violations (committing a Secret in plaintext, failing to disable the token automount, etc.) at the CI stage.
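The kind of manifest-stage check described here can be illustrated with a toy linter. This is not Kubescape's rule set or API: `manifest_findings` and its messages are hypothetical, and manifests are passed as already-parsed dictionaries to keep the sketch free of third-party dependencies.

```python
from typing import Dict, List


def manifest_findings(manifest: Dict) -> List[str]:
    """Flag two of the issues named in the text for a single parsed manifest."""
    findings = []
    kind = manifest.get("kind")
    # A plain Secret with inline values should never reach the repository.
    if kind == "Secret" and (manifest.get("data") or manifest.get("stringData")):
        findings.append("plain Secret with inline values")
    # The token automount must be switched off explicitly.
    if kind == "ServiceAccount" and manifest.get("automountServiceAccountToken") is not False:
        findings.append("automountServiceAccountToken is not disabled")
    return findings


plain_secret = {"kind": "Secret", "stringData": {"password": "postgres@prod"}}
service_account = {"kind": "ServiceAccount", "metadata": {"name": "myshop-api"}}
hardened_account = {"kind": "ServiceAccount", "automountServiceAccountToken": False}

for manifest in (plain_secret, service_account, hardened_account):
    print(manifest["kind"], manifest_findings(manifest))
```

Running a check like this in CI fails the build before the manifest is merged, which is the stage the text assigns to Kubescape.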
The secret governance checklist
We organize a one-page checklist for the quarterly checkup.
[Storage]
- whether EKS encryption-at-rest (KMS) is enabled
- whether there is no plaintext secret in the manifests of the git repo (gitleaks / trufflehog scan)
- consistency of the chosen tool among sealed-secrets / ESO / SOPS

[Rotation]
- the list of secrets not rotated for 90+ days
- whether AWS Secrets Manager automatic rotation is enabled (RDS, API keys, etc.)
- whether the rotation-failure alert works

[Injection]
- whether the choice of envFrom vs volumeMount matches the rotation frequency
- integration with Reloader: automating the Pod restart on Secret renewal
- the list of workloads operable at zero passwords with IRSA + RDS IAM auth

[Audit]
- EKS audit log enabled + quarterly CloudWatch Insights checkup
- the Secrets Manager GetSecretValue checkup in CloudTrail
- the alert-handling state of GuardDuty / Kubescape
- the application rate of automountServiceAccountToken: false

The goal of secret operations is for this checklist to fit on one page and be filled in regularly each quarter. The regular operations calendar of Chapter 26 and this chapter’s checklist together support the security posture of a production cluster.
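The "not rotated for 90+ days" item from the checklist lends itself to a small script. The sketch below assumes each secret is a dictionary with `Name`, `LastRotatedDate`, and `CreatedDate` keys, mirroring the fields returned when listing secrets in AWS Secrets Manager; the function name, the fallback to the creation date, and the sample data are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def stale_secrets(secrets: List[Dict], now: datetime, max_age_days: int = 90) -> List[str]:
    """Return names of secrets whose last rotation is `max_age_days` or more ago."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for secret in secrets:
        # Never-rotated secrets fall back to their creation date.
        last = secret.get("LastRotatedDate") or secret.get("CreatedDate")
        if last is None or last <= cutoff:
            stale.append(secret["Name"])
    return stale


# Made-up inventory with a fixed "now" so the result is reproducible.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
inventory = [
    {"Name": "myshop-api-db", "LastRotatedDate": now - timedelta(days=10)},
    {"Name": "legacy-api-key", "LastRotatedDate": now - timedelta(days=120)},
    {"Name": "never-rotated", "CreatedDate": now - timedelta(days=400)},
]

report = stale_secrets(inventory, now)
print(report)
```

The output list is what goes into the [Rotation] section of the quarterly checklist.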
Exercises
- Install both sealed-secrets and external-secrets on your dev cluster, and operate the same secret (e.g., a dummy DB password) with each tool separately. Follow a secret-rotation scenario (changing the value) along both paths and compare how many steps sealed-secrets needs versus how many steps ESO needs. Organize the shape of the manifest’s git diff, the change in the ArgoCD UI, and whether a Pod restart occurs into one table.
- Pick one workload of myshop-api and move it to “zero passwords.” Add one database user of RDS to the rds_iam group, and apply code that connects with an IAM token like this chapter’s Python example. In one paragraph, organize the decision rationale, tailored to your scenario, for how to solve the token-expiry problem that arises when combining with PgBouncer: whether to bypass the pooler, or to have the pooler itself receive the token.
- Enable the EKS audit log and run this chapter’s secret-access query in CloudWatch Insights. Classify the results over a week into normal / abnormal patterns, and look for items suspected to be abnormal (an unexpected ServiceAccount accessing myshop’s Secret, a large volume of GetSecretValue at dawn, etc.). Write one manifest that auto-detects the discovered pattern with a PrometheusRule of Chapter 25, Monitoring · Alerts or a GuardDuty rule.
In one line: A K8s Secret’s base64 is not encryption, and etcd encryption-at-rest protects only the data inside the cluster; the manifest stage is a separate problem. Secret operations run along the four axes of storage, rotation, injection, and audit, and the tools split into three groups: sealed-secrets (self-contained in git) / external-secrets (external-store sync + automatic rotation) / SOPS (the simple choice for a small team). envFrom is simple but needs a Pod restart on rotation, while volumeMount auto-renews the file. The “zero passwords” model of IRSA + RDS IAM auth is the most advanced, but because of token expiry and the difficulty of combining it with PgBouncer it fits only some workloads. RBAC + per-environment key separation + automountServiceAccountToken: false are the last line of defense, and the EKS audit log + CloudTrail + GuardDuty are the tools of the audit layer. The goal of operations is for the quarterly secret governance checklist to fit on one page.
Next chapter
If this chapter dealt with the shape of secrets, the next chapter is the shape of time. K8s ships a new minor version roughly every four months, and EKS standard support lasts 14 months. At least one minor upgrade a year is an essential operational cycle, and the manual for running that cycle safely is the body of the next chapter.
Chapter 30, Upgrade Strategy covers the EKS upgrade flow briefly mentioned in Chapter 26, The Operations Checklist. It covers the order control plane → data plane → add-ons, deprecated API detection (pluto · kubent · the apiserver_requested_deprecated_apis metric), the safety devices of node drain (PDB · terminationGracePeriodSeconds), minimizing the blast radius, rollback scenarios, and the checklist for the week before / the day of / the week after the upgrade.