#Kubernetes

136 posts

Thursday, May 21, 2026 13 min read

Observability

We lay out the three pillars that give a production cluster visibility: metrics (Prometheus + kube-state-metrics + node-exporter), logs (Loki), and traces (OpenTelemetry + Tempo), together with the standard visualization stack (Grafana) and alerting (Alertmanager). We cover the ServiceMonitor and PrometheusRule resources of kube-prometheus-stack, PromQL and LogQL examples, and the operational guardrails around cardinality, retention, alert signal-to-noise ratio, and the golden signals.

kubernetes

Thursday, May 21, 2026 18 min read

Operations Checklist

The last chapter of Part 4 (EKS in Production). Standing up a cluster reliably and operating it safely for a year are different kinds of work. We cover the EKS minor upgrade cycle, the node-group replacement pattern, RDS PITR and quarterly recovery drills, cost control with Karpenter and Spot, and how to make security checks routine with kube-bench, Trivy, and Kyverno. Finally, we close with a retrospective on the 6 chapters of Part 4 (Chapters 21 to 26) and on all 26 chapters of Parts 1 to 4.

kubernetes

Thursday, May 21, 2026 25 min read

RBAC / NetworkPolicy / ResourceQuota

A walkthrough of the three policy objects that provide isolation for multi-tenant operations, where several teams and environments share one cluster. It covers RBAC's Role, ClusterRole, ServiceAccount, and RoleBinding model, NetworkPolicy's default-deny pattern and its dependence on the CNI, and the pairing of ResourceQuota with LimitRange, all in one chapter that closes Part 2.

kubernetes

Thursday, May 21, 2026 14 min read

RBAC / ServiceAccount in Depth

Building on the RBAC basics from Chapter 14, we add the extra depth you run into in a production cluster. We cover aggregated ClusterRoles, which merge ClusterRoles by label; impersonation, which makes calls with another subject's permissions; how ServiceAccount tokens moved from a permanent Secret to a projected token with expiry, audience, and rotation; and the model that ties a Kubernetes ServiceAccount to cloud IAM through EKS's IRSA and GKE's Workload Identity.

kubernetes

Thursday, May 21, 2026 16 min read

Secret Operations

The third chapter of Part 5. Starting from the limits of base64 encoding in a Kubernetes Secret and the meaning of etcd encryption at rest, it covers the secret lifecycle along four axes: storage, rotation, injection, and audit. It turns the following into a practical operations manual: a comparison of sealed-secrets, external-secrets, and SOPS; zero-password operation combined with IRSA (IRSA for the AWS API, RDS IAM auth for the database); the rotation difference between envFrom and a volume mount; per-namespace separation with RBAC; and auditing through the audit log and GuardDuty.

kubernetes

Thursday, May 21, 2026 14 min read

The CRD and Operator Pattern

We cover the two mechanisms for extending the Kubernetes API with objects from your own domain. You define a new object kind with a CustomResourceDefinition, and a controller-runtime-based Operator runs the reconcile loop from Chapter 1 over that object, extending Kubernetes's declarative model all the way into your domain. We cover the three standard patterns (ownerReference, finalizer, and the status subresource) and the build tools Kubebuilder and Operator SDK.

kubernetes

Thursday, May 21, 2026 17 min read

Upgrade Strategy

The last chapter of Part 5. An operations manual for safely keeping up with Kubernetes minor releases (14 months of support). It covers the upgrade order control plane → data plane (nodes) → add-ons, deprecated API detection (pluto, kubent, and the apiserver metric), API-version migration of manifests, Helm charts, and Operator CRs, the EKS node group and Karpenter NodePool drift flow, the safeguards for node drain (PDB and terminationGracePeriodSeconds), minimizing the blast radius, rollback scenarios, choosing a backup strategy by RPO and RTO, and checklists for the week before, the day of, and the week after the upgrade.

kubernetes

Wednesday, May 20, 2026 25 min read

Autoscaling

A walkthrough of the three dimensions of automatic scaling that absorb a production cluster's load swings without human intervention. It covers the roles of HPA (Pod count), VPA (Pod resources), and Cluster Autoscaler (node count), the metrics-server prerequisite, HPA's autoscaling/v2 manifest and proportional algorithm, the asymmetry between scale-up and scale-down, custom metrics and KEDA, VPA's updateMode and the conflict between HPA and VPA, and Karpenter.

Infrastructure Kubernetes Container Orchestration Certification

Wednesday, May 20, 2026 10 min read

Certified Kubernetes Administrator (CKA) #10 Workloads 1: Deployment in Depth, ReplicaSet, Rolling Update and Rollback

The tenth post in the Certified Kubernetes Administrator (CKA) series. We look closely at the Deployment, the workload an operator handles most often. We walk through the Deployment→ReplicaSet→Pod hierarchy and the label selector that binds it together, how to create and scale with kubectl, the conditions under which the rollingUpdate strategy (maxSurge/maxUnavailable) guarantees a zero-downtime update, and how to track versions and roll back with kubectl rollout, all drilled until they are second nature.

Infrastructure Kubernetes Container Orchestration Certification

Wednesday, May 20, 2026 10 min read

Certified Kubernetes Application Developer (CKAD) #5 Workloads 1: Deployment, ReplicaSet, Rolling Update, and Rollback

The fifth post in the Certified Kubernetes Application Developer (CKAD) series. We create a Deployment, the heart of app delivery, imperatively, and lay out how Deployment, ReplicaSet, and Pod relate and scale. We get hands-on with the meaning of rollingUpdate's maxSurge and maxUnavailable, shipping a new version with kubectl set image, and a rollback scenario: tracking state with kubectl rollout and reverting a failed version with undo.

Infrastructure Kubernetes Container Orchestration Certification

Wednesday, May 20, 2026 11 min read

Certified Kubernetes Security Specialist (CKS) #3: CIS benchmark (kube-bench), component security, Ingress TLS, binary verification

The third post in the Certified Kubernetes Security Specialist (CKS) series. It covers the remaining half of the Cluster Setup domain: hardening the cluster itself. With commands and manifests, we work through what the CIS Kubernetes benchmark is, how to inspect the control plane and nodes with kube-bench, how to read the PASS/FAIL/WARN results and apply remediation, the procedure for changing dangerous apiserver and kubelet flags to safe values, how to attach TLS to an Ingress, and how to verify a downloaded binary with sha256sum.

kubernetes

Wednesday, May 20, 2026 17 min read

ConfigMap and Secret

Separate config and passwords from the manifest with ConfigMap and Secret. We cover how Kubernetes implements the 12-factor "store config in the environment" principle, the three injection methods (env, envFrom, and volume), the fact that a Secret's base64 encoding is not encryption, and why a Pod restart is needed when config changes.