Kubernetes: From Basics to EKS in Production

From your first kubectl command to GitOps and observability — Kubernetes in one book

In progress · 32 chapters · Last updated: May 21, 2026

What this book covers

  1. One continuous flow from fundamentals to operations: starting at kubectl get pods and moving through RBAC · Admission Controller · CRD · GitOps · observability · EKS operations, the core concepts of Kubernetes are introduced in the order you meet them in real operations.
  2. EKS at the center: local environments like minikube · kind are covered too, but the book centers on real operations on AWS EKS, the cloud operational experience that local study alone cannot provide.
  3. It doesn’t just show you YAML: it explains why a resource takes the shape it does, which controller maintains the state, and where to start diagnosing when something breaks.
  4. Operations covered in depth: security through RBAC · NetworkPolicy · Admission, observability with Prometheus · Grafana · Loki · OpenTelemetry, debugging, cost optimization, upgrades, and backup and recovery. These operational topics fill the latter half of the book.
  5. It finishes as one fullstack system: in the Part 6 capstone, the Next.js app from React and the FastAPI app from Modern Python are deployed together on one EKS cluster. You can see directly how the techniques from Chapters 1 ~ 30 connect and operate inside a real service.

What this book does not cover

  • Kubernetes contribution / kubelet internals / etcd operation and other cluster-component details are covered in a separate book.
  • Deep topics such as writing Helm charts, ArgoCD ApplicationSet patterns, and the Cilium eBPF data plane belong in a later K8s deep-dive book. This book covers only how to use them.
  • Multi-cluster federation / service mesh (Istio / Linkerd) depth is covered in a later book.
  • Differences among managed offerings other than EKS (GKE / AKS) are covered only as a mapping table in the appendix.

Who this book is for

  • Those who know containers but are new to Kubernetes: developers who have come as far as Docker / docker-compose and are now asking “is Kubernetes the next step?” Appendix A, “From docker-compose to k8s,” is the starting point.
  • Those who have used kubectl but do not know operations: people who are comfortable with Pod / Deployment / Service but lack experience with RBAC · observability · GitOps · cost · upgrades. Parts 3 ~ 5 are the core stretch.
  • Those who need to adopt k8s on AWS: infra / backend engineers who must run production workloads on EKS. Part 4, EKS in Production, is the direct guide.
  • Infra operators / DevOps · SRE entry track: Part 5 (operations · debugging · cost) + Part 6 (fullstack EKS deployment) serve as the manual.

How this book is structured

The book runs to 32 chapters: Chapters 1 ~ 31 plus Appendix A.

  • Part 1: Getting Started with Kubernetes (7 chapters) — what Kubernetes is · local environments · kubectl and your first Pod · Deployment · Service · ConfigMap/Secret · Namespace, so you can run a small cluster by hand.
  • Part 2: Workloads and Operations (7 chapters) — StatefulSet · PV/PVC · Ingress · resources · health checks · autoscaling · RBAC, so you can operate a variety of workloads.
  • Part 3: Depth (6 chapters) — CNI · RBAC in depth · Admission Controller · CRD/Operator · observability · GitOps, expanding into the operator’s view.
  • Part 4: EKS in Production (6 chapters) — EKS setup · app deployment skeleton · RDS integration · CI/CD · monitoring/alerts · operations checklist, for one full cycle of real operations on AWS.
  • Part 5: Operations · Debugging · Cost (4 chapters) — kubectl debugging patterns · cost optimization · secret operations · upgrade strategy, the four subjects you meet while running things.
  • Part 6: Capstone (1 chapter) — deploy the modern-python (FastAPI) and modern-react (Next.js) apps on one EKS cluster, applying Ingress · Helm · ArgoCD · observability · IRSA · External Secrets together.
  • Appendix A (1 chapter) — a mapping table from docker-compose.yml resources to Kubernetes resources, plus seven differences that commonly trip you up during migration.

The series this book is built from

This book is built from the 26 parts of the series below, plus 5 new chapters (4 in Part 5, 1 in Part 6), Appendix A, and a full revision pass. The series below are still on the site for free.

The book reorganizes that series into a path from fundamentals to EKS in production, then adds four operations · debugging · cost chapters, a fullstack EKS capstone, and a docker-compose migration appendix to make it one volume. More than 30 % of the text is new or revised, and the fullstack capstone is the centerpiece.

Tools that pair well

Nearly every chapter of this book has you writing YAML manifests by hand. A single misplaced indent or unbalanced quote will make kubectl apply fail in ways that are hard to trace. Before you apply a manifest to the cluster, paste it once into utilrepo’s YAML validator to check that the syntax is sound. utilrepo is a lightweight collection of browser-based web utilities; nothing secret leaves your machine, and it catches common traps like multi-document manifests joined by --- and mixed tabs and spaces.
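
The kind of check described above can also be sketched locally. The following is a minimal Python sketch using only the standard library, not a real YAML parser and not how utilrepo's validator works. The helper names `precheck_manifest` and `split_documents` are hypothetical. It flags tab characters in indentation (which YAML does not allow) and splits a multi-document manifest on `---` separator lines.

```python
import re


def precheck_manifest(text: str) -> list[str]:
    """Return warnings for a manifest; an empty list means nothing was flagged."""
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything before the first non-blank character.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            warnings.append(f"line {number}: tab character in indentation")
    return warnings


def split_documents(text: str) -> list[str]:
    """Split a multi-document manifest on '---' lines, dropping empty documents."""
    parts = re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]
```

A pre-check like this only catches the two traps named above. It does not replace a full syntax validation before you apply the manifest.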

How this book is funded

This book is funded by site ads (AdSense) and reader support. There is no purchase flow, and all 32 chapters are open to read on the site.

If a chapter helps you, you can support the book on Ko-fi. Reader support makes the next minor revisions and the next book possible.


Frequently asked questions

How is this different from the AWS book?

The AWS book (forthcoming) and this book are sibling products. The same fullstack app (modern-python + modern-react) is deployed in each book's Part 6 capstone on a different platform: this book uses EKS, and the AWS book uses ECS Fargate. Reading the two side by side makes the operational difference between “managed containers vs Kubernetes” clear.

Will the book be obsolete when a major Kubernetes version ships?

The core model (Pod / Deployment / Service / Ingress / RBAC / Operator / GitOps) should hold for at least the next 2 ~ 3 years of major versions. Changed APIs and deprecation notices are folded into minor revisions (v1.x), and at a major change (for example, a hypothetical k8s 2.0), a separate v2 line can begin.

How much does following the EKS parts cost?

Following Part 4 (EKS in Production) + Part 6 (capstone) runs about $40 ~ $80 per month of EKS cluster cost (2 t3.medium nodes + the EKS control plane + ALB + RDS db.t3.micro). EKS infrastructure cost is billed directly to your AWS account, so each chapter ends with teardown commands that let you delete the cluster immediately after the exercise. The Part 6 intro also explains ways to minimize that cost within the AWS free tier.

Where can I get the book’s code?

Each chapter’s manifests and Terraform code are written directly in the main text as code blocks, so we recommend typing them out by hand. The finished version of the Part 6 capstone’s 13 steps will be provided separately as a GitHub repository (modern-kubernetes-capstone). We’ll add the link to this book page once it’s ready.

Can I read this in languages other than English?

Korean, Japanese, and English editions all share the same 32-chapter structure. You can read each from its own book page.

Isn’t a chapter more than 70 % the same as a post on the site?

Some chapters share a topic with the source series posts. But each book chapter is rewritten — (1) re-narrated to fit the book’s flow, (2) unified to the Kubernetes 1.32 + EKS baseline, (3) cross-linked with other chapters in the book, and (4) given exercises and a one-line summary — so even on the same topic the result reads as distinct material. The 5 new chapters (4 in Part 5 + the Part 6 capstone) and Appendix A are not in the source series.

How do I send support or feedback?

Feedback is welcome through blog comments or email. Typos, improvement suggestions, and manifest error reports per chapter are folded quickly into the next minor revision. Support runs through the Ko-fi channel, from $1 and up.

What’s next

All 32 chapters of this book are published in Korean, Japanese, and English. Updates will proceed as follows.

  1. Stabilizing the v1 text: over the 4 ~ 8 weeks after release, fixes for the typos, manifest errors, and thin explanations that readers report are folded into minor revisions (v1.x).
  2. Part 6 capstone GitHub repository: the finished 13 steps will be organized into a separate repository, with the link added to this book page.
  3. Regular Kubernetes / EKS updates: the price tables and instance types in the Part 4 EKS chapters are reviewed on a six-month cadence. A v2 book line can start at a Kubernetes major change.

You can subscribe to new-chapter and major-revision notices via the RSS feed.

Contents

Part 1: Getting Started with Kubernetes (7 chapters)

What Kubernetes is · local environments · kubectl and your first Pod · Deployment / Service · ConfigMap / Secret · Namespace — the seven topics a beginner needs to get hands-on with.

  1. What Kubernetes Is: Why you need a container orchestrator. Starting from a reader who has used Docker / docker-compose, this chapter lays out the five limits of single-container tooling, the declarative desired state + reconcile loop model, the big picture of control plane / worker node, and the scope of the book.
  2. Local Environment: Choose between minikube · kind · Docker Desktop k8s. Compare how each option works and its pros and cons, then install kubectl and bring up your first cluster with kind to check the nodes and system Pods, all in one pass.
  3. kubectl and Your First Pod: Build the mental model of kubectl and bring up your first Pod. Covers one imperative cycle of kubectl run, the declarative YAML manifest, the everyday commands get / describe / logs / exec, the Pod lifecycle, and common failure patterns like ImagePullBackOff · CrashLoopBackOff.
  4. Deployment and ReplicaSet: Declarative deployment and rolling updates. Covers the relationship among the three tiers Deployment / ReplicaSet / Pod, self-healing with replicas: 3, RollingUpdate's maxSurge / maxUnavailable, rollback with rollout undo, and the workloads Deployment doesn't solve (StatefulSet · DaemonSet · Job).
  5. Service: The abstraction that solves the problem of Pod IPs being temporary. Covers the stable ClusterIP · selector · Endpoints / EndpointSlice, the criteria for choosing among the three types ClusterIP · NodePort · LoadBalancer, kube-proxy's DNAT, and CoreDNS's short-name resolution.
  6. ConfigMap and Secret: Separate config and passwords from the manifest with ConfigMap and Secret. Covers how Kubernetes applies 12-factor's "store config in the environment" principle, the three injection methods env · envFrom · volume, the fact that a Secret's base64 is an encoding and not encryption, and why a Pod restart is needed when config changes.
  7. Namespace and Labels: The model of splitting one cluster with namespaces and the syntax of labels · selectors. Covers the limits of `default`, the four system namespaces, the namespace as the unit of RBAC · ResourceQuota · NetworkPolicy, the `kubens` operational tip, the `app.kubernetes.io/*` standard labels, and the selector syntax of `kubectl -l`. This chapter closes Part 1.
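
The Chapter 6 point that a Secret's base64 is not encryption can be shown in a few lines. This is a minimal sketch using Python's standard `base64` module. The function names and the sample value are hypothetical and are not taken from the book's manifests.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears under `data:` in a Secret manifest."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key or password is involved."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical sample value, for illustration only.
encoded = encode_secret_value("admin")
recovered = decode_secret_value(encoded)
```

Because decoding needs no key, anyone who can read the Secret object can recover the value. That is why the later chapters treat access control and encryption at rest as separate concerns.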

Part 2: Workloads and Operations (7 chapters)

StatefulSet · PV/PVC · Ingress · resources · health checks · autoscaling · RBAC — moving from a small cluster to operating a variety of workloads.

  8. StatefulSet / DaemonSet / Job / CronJob: A walkthrough of the controllers that handle the four kinds of workload Deployment's stateless assumption cannot express. Covers StatefulSet's identity and 1:1 PVC, DaemonSet's one-per-node model, Job's termination model, and CronJob's cron scheduling with the concurrencyPolicy · startingDeadlineSeconds safeguards.
  9. PV / PVC / StorageClass: A walkthrough of the persistent-data model that survives beyond a Pod's lifecycle. Covers the PV · PVC · StorageClass triangle, static · dynamic provisioning, accessModes (RWO · RWX · RWOP), reclaimPolicy, volumeBindingMode's WaitForFirstConsumer, allowVolumeExpansion, and what a StatefulSet's volumeClaimTemplates creates on top of this model.
  10. Ingress and the Ingress Controller: An abstraction for how external traffic reaches a Service inside the cluster. Covers the two-layer separation of the Ingress object and the Ingress Controller, host · path · pathType-based routing, TLS termination and cert-manager, IngressClass, and the successor standard, the Gateway API.
  11. resources.requests / limits: A walkthrough of how a container requests CPU and memory and how it is given an upper bound. Covers the separation of requests and limits, the QoS classes (Guaranteed · Burstable · BestEffort), the difference in behavior between CPU throttling and memory OOMKilled, the cgroup awareness of the JVM · Go runtimes, the namespace policies of LimitRange · ResourceQuota, and the operational cycle of setting initial values and adjusting them.
  12. Health Checks: A walkthrough of how Kubernetes judges whether a container is alive and ready to receive traffic. Covers the role separation of the three probes, liveness · readiness · startup; the httpGet · tcpSocket · exec check methods; tuning parameters such as initialDelaySeconds · periodSeconds · failureThreshold; the cascading failure that happens when you put an external dependency in liveness; and graceful shutdown with terminationGracePeriodSeconds and the preStop hook.
  13. Autoscaling: A walkthrough of the three dimensions of automatic adjustment that absorb a production cluster's load swings without human intervention. Covers the roles of HPA (Pod count) · VPA (Pod resources) · Cluster Autoscaler (node count), the metrics-server prerequisite, HPA's autoscaling/v2 manifest and proportional algorithm, the scale-up · scale-down asymmetry, custom metrics and KEDA, VPA's updateMode and the HPA · VPA conflict, and Karpenter.
  14. RBAC / NetworkPolicy / ResourceQuota: A walkthrough of the three policy objects that create isolation for multi-tenant operations where several teams · environments live together in one cluster. Covers RBAC's Role · ClusterRole · ServiceAccount · RoleBinding model, NetworkPolicy's default-deny pattern and CNI dependency, and the pairing of ResourceQuota and LimitRange, all in one chapter. This chapter closes Part 2.
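
The proportional algorithm mentioned for Chapter 13 can be sketched as a single function. This follows the documented upstream formula, desired = ceil(current replicas × current metric / target metric), and skips scaling when the ratio is within a tolerance (0.1 by default upstream). The function name, parameter names, and default bounds are assumptions for illustration. Stabilization windows and multiple metrics are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA proportional rule for a single metric."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas / maxReplicas bounds of the HPA object.
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas averaging 90 % CPU against a 60 % target give a ratio of 1.5, so the sketch asks for 6 replicas.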

Part 3: Depth (Security · Extensibility · Observability) (6 chapters)

CNI in depth · RBAC in depth · Admission Controller · CRD/Operator · Prometheus/Grafana/Loki · GitOps — expanding into the operator's point of view.

  15. CNI in Depth: How the same NetworkPolicy manifest resolves into iptables rules on Calico and into eBPF programs on Cilium, that is, the depth of the data plane. We cover the four conditions of the Kubernetes network model, what the CNI interface actually is, the three data plane models (iptables · IPVS · eBPF), a comparison of Calico and Cilium, and the practical criteria for choosing a CNI.
  16. RBAC / ServiceAccount in Depth: On top of the RBAC basics from Chapter 14, we add the next layer of depth you meet in a production cluster. We cover Aggregated ClusterRole, which merges ClusterRoles by label; Impersonation, which makes calls with another subject's permissions; the move of the ServiceAccount token from a permanent Secret to a projected token with expiry · audience · rotation; and the model that ties a Kubernetes ServiceAccount to cloud IAM via EKS's IRSA · GKE's Workload Identity.
  17. Admission Controller: We cover the admission model, where the Kubernetes API server inspects and transforms a manifest just before storing it in etcd. We cover the two types, Mutating and Validating; the built-in controllers (LimitRanger · ResourceQuota · PodSecurity, etc.); the webhook mechanism; and a comparison of the two policy engines built on top of it, OPA Gatekeeper (Rego) and Kyverno (YAML).
  18. The CRD and Operator Pattern: We cover the two axes of extending the K8s API into objects of your own domain. You define a new object kind with a CustomResourceDefinition, and a controller-runtime-based Operator runs the reconcile loop from Chapter 1 over that object, extending K8s's declarative model all the way to your domain. We cover the three standard patterns of ownerReference · finalizer · status subresource and the build tools Kubebuilder · Operator SDK.
  19. Observability: We cover the three axes that give a production cluster visibility: metrics (Prometheus + kube-state-metrics + node-exporter), logs (Loki), and traces (OpenTelemetry + Tempo), together with the standard visualization stack (Grafana) and alerting (Alertmanager). We cover the ServiceMonitor · PrometheusRule pieces of kube-prometheus-stack, examples of PromQL · LogQL, and the operational guardrails of cardinality · retention · alert SNR · golden signals.
  20. GitOps: We cover the operational model where the source of truth for manifests sits in git and a controller inside the cluster watches git to sync automatically. This chapter closes Part 3 and covers the difference between the push model and the pull model, the four principles of GitOps, ArgoCD's Application CRD · App of Apps · Sync Wave, Flux's GitRepository · Kustomization · HelmRelease, directory-structure patterns, and the three standard tools for putting secrets in git.
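
The reconcile loop that Chapter 1 introduces and Chapter 18 reuses for Operators can be reduced to one comparison: desired state against observed state, with actions that close the gap. The sketch below is a toy model in plain Python. It is not the controller-runtime API, and the `reconcile` function and the action strings are hypothetical.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return the actions needed to move observed replica counts to the desired ones."""
    actions = []
    for name, want in sorted(desired.items()):
        have = actual.get(name, 0)
        if have < want:
            actions.append(f"create {want - have} pod(s) for {name}")
        elif have > want:
            actions.append(f"delete {have - want} pod(s) for {name}")
    # Anything running that is no longer declared gets removed.
    for name in sorted(set(actual) - set(desired)):
        actions.append(f"delete all {actual[name]} pod(s) for {name}")
    return actions
```

A real controller runs this comparison repeatedly and idempotently: when the observed state already matches, the function returns no actions, so re-running it is harmless.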

Part 4: EKS in Production (6 chapters)

EKS cluster setup · app deployment skeleton · RDS integration · CI/CD · monitoring/alerts · operations checklist — one full cycle of real operations on AWS.

  21. EKS Cluster Setup: We cover the flow of standing up a real production cluster on AWS EKS from scratch. With Terraform we declare the VPC · EKS control plane · node group · IRSA · essential add-ons (VPC CNI · CoreDNS · kube-proxy · EBS CSI) in one codebase, and we round out the chapter with eksctl's quick-setup option, Karpenter's node autoscaling, the first checks, and the cost model.
  22. App Deployment Skeleton: We deploy the sample service myshop-api as a set of manifests onto the empty EKS cluster stood up in Chapter 21. We arrange the 9 objects Namespace · ServiceAccount · ConfigMap · Secret · Deployment · Service · Ingress · HPA · PodDisruptionBudget into a single flow, and auto-provision an ALB with the AWS Load Balancer Controller. We follow all the way through to abstracting that set into a Helm chart and applying it to dev / prod with different values.
  23. DB Integration (RDS · External Secrets): The myshop-api we exposed externally in Chapter 22 is an empty shell with no data store. This chapter fills that gap. We stand up RDS PostgreSQL with Terraform, keep the master password in AWS Secrets Manager, auto-sync that secret into a Kubernetes Secret with the External Secrets Operator, grant permissions without static credentials via IRSA, add a connection pool with PgBouncer, and automate schema migrations with a Helm hook-based Job pattern.
  24. CI/CD Pipeline: The myshop-api built through Chapter 23 still relies heavily on humans when a new version comes in. This chapter automates that process. With OIDC trust, GitHub Actions pushes a container image to AWS ECR without static keys and auto-commits the Helm values in the manifest repo, and ArgoCD, covered in Chapter 20, detects that change and syncs it to the cluster. We also cover PR approval gates, the dev / prod split, Argo Rollouts canary deployment, and image tag immutability.
  25. Monitoring · Alerts: The myshop-api built through Chapter 24 is automated from code to deployment, but you cannot operate a service whose behavior you cannot see. This chapter layers on the EKS cluster's observability stack. We install Prometheus · Grafana · Alertmanager at once with kube-prometheus-stack, standardize myshop-api metrics and the 4 golden signals alerts with ServiceMonitor / PrometheusRule, capture logs with Loki, keep AWS-coupled metrics and long-term retention with CloudWatch Container Insights, and set up the on-call flow of Slack / PagerDuty with severity · team routing.
  26. Operations Checklist: The last chapter of Part 4 (EKS in Production). Standing up a cluster reliably and operating it safely over a year are different kinds of work. We cover the EKS minor upgrade cycle, the node-group replacement pattern, RDS PITR and quarterly recovery drills, the path of taming cost with Karpenter + Spot, and the flow of regularizing security checks with kube-bench · Trivy · Kyverno. Finally, we bring together a retrospective on the 6 chapters of Part 4 (Chapters 21 ~ 26) and the 26 chapters of Parts 1 ~ 4.
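
Chapter 22's step of applying one chart to dev / prod with different values rests on layering an override file over base values. The sketch below approximates that layering with a recursive dictionary merge in plain Python. It is an illustration under assumptions, not Helm's implementation, and the `merge_values` name and the sample keys are hypothetical.

```python
def merge_values(base: dict, override: dict) -> dict:
    """Return base with override layered on top; nested maps merge, other values replace."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical base values and a prod override, for illustration only.
base_values = {
    "replicaCount": 1,
    "resources": {"requests": {"cpu": "100m", "memory": "128Mi"}},
}
prod_override = {
    "replicaCount": 3,
    "resources": {"requests": {"cpu": "500m"}},
}
prod_values = merge_values(base_values, prod_override)
```

The prod result keeps the base memory request while taking the overridden CPU request and replica count, which is the behavior that lets an environment file stay short.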

Part 5: Operations · Debugging · Cost (4 chapters)

kubectl debugging patterns · cost optimization · secret operations · upgrade strategy — four subjects you encounter while running things in production.

  27. kubectl Debugging Patterns: The first chapter of Part 5 (Operations · Debugging · Cost). It collects the diagnostic trees for the incidents you meet most often on a production cluster (CrashLoopBackOff, OOMKilled, ImagePullBackOff, Pending, a Service that cannot be reached). Starting from the three commands describe · events · logs, it ties together kubectl debug's ephemeral container, network diagnostic patterns, and the Chapter 19 observability stack into a manual that can serve as a junior SRE's first reference.
  28. Cost Optimization: The second chapter of Part 5. It picks up the cost items that Chapter 26 pointed out across five sources. It ties together the two axes of compute (nodes) and add-ons (LB · storage · network · control plane), the cost meaning of requests, right-sizing with VPA · Goldilocks · KRR, the decision tree of Spot · Karpenter · Cluster Autoscaler, bin packing and descheduler, visualization with OpenCost · Kubecost, chargeback / showback by namespace label, and PV · network cost. It closes with a checklist for reviewing next month's bill.
  29. Secret Operations: The third chapter of Part 5. Starting from the base64 limit of a K8s Secret and the meaning of etcd encryption-at-rest, it covers the secret lifecycle along the four axes of storage · rotation · injection · audit. It turns the comparison of sealed-secrets · external-secrets · SOPS, zero-password operation combined with IRSA (IRSA for the AWS API, RDS IAM auth for the DB), the rotation difference of envFrom vs mount, separation per namespace with RBAC, and the audit viewpoint of the Audit log and GuardDuty into a practical operations manual.
  30. Upgrade Strategy: The last chapter of Part 5. An operations manual for safely keeping up with Kubernetes minor releases (14 months of support). It covers the order control plane → data plane (nodes) → add-ons, deprecated API detection (pluto · kubent · apiserver metric), the API-version migration of manifests / Helm / Operator CRs, the node group / Karpenter NodePool drift flow of EKS, the safety devices of node drain (PDB · terminationGracePeriodSeconds), minimizing the blast radius, rollback scenarios, choosing a backup per RPO / RTO, and the checklist for the week before, the day of, and the week after the upgrade.
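
The deprecated API detection step in Chapter 30 can be illustrated with a small scan over manifest text. This is a minimal sketch, not how pluto or kubent work internally. The `REMOVED_APIS` table holds only a few well-known upstream removals, and the `find_removed_apis` helper name is hypothetical.

```python
import re

# A few apiVersion/kind pairs that upstream Kubernetes has removed,
# mapped to the release that removed them. Deliberately incomplete.
REMOVED_APIS = {
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodDisruptionBudget"): "1.25",
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): "1.26",
}


def find_removed_apis(manifest_text: str) -> list[str]:
    """Report documents in a multi-document manifest that use a removed API."""
    findings = []
    for document in re.split(r"^---[ \t]*$", manifest_text, flags=re.MULTILINE):
        # Only top-level keys are matched, so nested fields are ignored.
        api = re.search(r"^apiVersion:[ \t]*(\S+)", document, flags=re.MULTILINE)
        kind = re.search(r"^kind:[ \t]*(\S+)", document, flags=re.MULTILINE)
        if not api or not kind:
            continue
        removed_in = REMOVED_APIS.get((api.group(1), kind.group(1)))
        if removed_in:
            findings.append(
                f"{kind.group(1)} uses {api.group(1)}, removed in Kubernetes {removed_in}"
            )
    return findings
```

A text scan like this only sees manifests on disk. Objects already stored in the cluster or rendered by Helm need the dedicated tools the chapter names.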