#Kubernetes

136 posts

Certified Kubernetes Administrator (CKA) #2 Cluster Architecture 1: Control plane (apiserver/etcd/scheduler/controller-manager)
12 min read

The second post in the Certified Kubernetes Administrator (CKA) series. We look at how a cluster actually runs, starting from the control plane. We cover what kube-apiserver (the gateway for all communication), etcd (the cluster state store), kube-scheduler (the Pod placement decision), and kube-controller-manager (the reconciliation loop) each do, how the control plane runs as static Pods, and what happens to the cluster when a component dies — all from an operator's point of view.

Certified Kubernetes Administrator (CKA) #1: The Exam Environment — alias and dry-run, vim/yq setup, time management
8 min read

The opening post of the Certified Kubernetes Administrator (CKA) series. We lay out the structure of the 2-hour hands-on exam, the weight of the five domains (Troubleshooting at 30% is the crux), the passing line, and the testing environment — then drill the setup (alias, dry-run, vim/yq, etcdctl, systemctl) that decides how your exam time runs. This 27-part series targets a CKA pass, wrapping up with a hands-on mock exam in #27.

K8s Practice #6: Operations Checklist — Upgrades / Backup & Recovery / Cost / Security
13 min read

The last post in the K8s Practice series. Bringing up a cluster and operating it safely for a year are different kinds of work. This post organizes the EKS upgrade cycle, node group replacement pattern, RDS automated backup and PITR, cost management with Karpenter and Spot, and regular security checks with kube-bench and Trivy. It also includes a retrospective of the 6-post K8s Practice series and the full 26-post K8s track.

K8s Practice #5: Monitoring & Alerting — Prometheus / CloudWatch / Alertmanager
11 min read

The `myshop-api` built in [#4](/en/posts/k8s-practice-4) now has code-to-deploy automation, but operations do not work unless you can see what it is doing. This post organizes the EKS cluster observability stack. We install Prometheus + Grafana + Alertmanager at once with kube-prometheus-stack, combine that with CloudWatch via Container Insights and Fluent Bit, standardize myshop-api metrics and alerts via ServiceMonitor / PrometheusRule, and organize the on-call flow with the 4 golden signals rule set and Slack / PagerDuty routing.

K8s Practice #4: CI/CD Pipeline — GitHub Actions / ECR / ArgoCD
10 min read

The `myshop-api` built in [#3](/en/posts/k8s-practice-3) still relies on manual steps whenever a new version is released. This post automates that process. GitHub Actions pushes container images to AWS ECR via OIDC without static keys, auto-commits Helm values in the manifest repo so the ArgoCD covered in [Advanced #6](/en/posts/k8s-advanced-6) can detect the change and sync to the cluster, and keeps PR approval gates, dev/prod branching, and canary deployment in one flow.

K8s Practice #3: DB Integration — RDS / Secrets Manager / External Secrets / Connection Pool
10 min read

The `myshop-api` exposed in [#2](/en/posts/k8s-practice-2) is still an empty shell with no data store. This post organizes the flow of bringing up RDS PostgreSQL with Terraform, storing the master secret in AWS Secrets Manager, auto-syncing it into a K8s Secret with External Secrets Operator, accessing AWS without static credentials via IRSA, and adding PgBouncer as a connection pool. It also covers automating schema migration as a Job.

K8s Practice #2: App Deployment Skeleton — Deployment / Service / Ingress / Helm
10 min read

This post puts `myshop-api` on the empty EKS cluster brought up in [#1](/en/posts/k8s-practice-1). We organize Deployment / Service / Ingress / ConfigMap / Secret / ServiceAccount / HPA as one bundle, auto-provision an ALB with AWS Load Balancer Controller, and package the bundle as a Helm chart so the same chart deploys to dev and prod with different values.

K8s Practice #1: EKS Cluster Setup — Terraform / eksctl / IRSA / Addons
12 min read

The first post in the K8s Practice series. We follow the path of building a real operational cluster rather than a toy abstraction: we define the VPC and EKS cluster with Terraform, set up node groups and IRSA, install the essential addons (VPC CNI, CoreDNS, kube-proxy, EBS CSI), and compare eksctl as a faster setup option along the way. This is the starting point for the imaginary service myshop-api used throughout the 6-post series.

K8s Advanced #6: GitOps — ArgoCD / Flux
11 min read

The last post in the K8s Advanced series. GitOps is the operational model in which the source of truth for manifests lives in git and a controller inside the cluster watches git and syncs automatically. This post covers the difference between push and pull models, ArgoCD's Application CRD and sync wave, Flux's Source / Kustomization / HelmRelease, directory structure patterns, and how to safely store secrets in git via Sealed Secrets / External Secrets. It also includes a retrospective of the 6-post K8s Advanced series and a preview of the next track, K8s Practice.

K8s Advanced #5: Observability — Prometheus / Grafana / Loki / OpenTelemetry
10 min read

Observability for an operational cluster rests on three axes: metrics, logs, and traces. The K8s standard stack for each axis is nearly settled: metrics with Prometheus + kube-state-metrics + node-exporter, logs with Loki (or EFK), traces with OpenTelemetry, visualization with Grafana, and alerting with Alertmanager. This post covers the three-axis model, the standard components for each axis, and operational principles like cardinality, retention period, and alert design in one pass.

K8s Advanced #4: CRD and the Operator Pattern — controller-runtime
10 min read

One reason K8s is powerful is that you can extend its API itself. Define new object kinds with a CustomResourceDefinition, write a reconcile loop for them with controller-runtime, and your domain objects live as standard resources on top of K8s. Objects with names like PostgresCluster, RedisFailover, and KafkaBroker are the result. This post covers the CRD model, an Operator skeleton based on controller-runtime, and ownerReference / finalizer / status subresource in one pass.

K8s Advanced #3: Admission Controller — OPA Gatekeeper / Kyverno
10 min read

The K8s API server has a stage that can inspect and mutate manifests right before they're stored in etcd. This stage, admission control, is the entry point for an operational cluster's policy engine. Policies like "reject containers without limits," "force specific labels," and "restrict image origins" are enforced at the manifest level without changing a line of application code. This post covers where the admission stage sits, the built-in controllers, ValidatingWebhook and MutatingWebhook, and the models of two policy engines, OPA Gatekeeper and Kyverno, in one pass.