Kubernetes and Cloud Native Associate (KCNA) #5: Cloud Native Architecture (16%) — Autoscaling, Serverless, Community, Open Standards
In #4 we covered the layers that hold containers up beneath Kubernetes — container runtimes, security, networking, storage, and the service mesh. This post lifts our gaze one level higher. The domain it covers asks not about Kubernetes the tool, but about the cloud native design philosophy Kubernetes belongs to and the ecosystem that surrounds it.
KCNA’s Domain 3, Cloud Native Architecture, carries a weight of 16%, roughly a sixth of the exam. It tests the reasoning behind the design rather than commands or manifests: why resources should scale up and down automatically with load, why you might choose serverless and leave infrastructure management to the platform, and why a foundation like the CNCF and open standards matter. This post lays out that philosophy and its vocabulary.
What is cloud native #
The starting point of this domain is the definition of cloud native. The CNCF (Cloud Native Computing Foundation) defines cloud native as a way to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. It names containers, service meshes, microservices, immutable infrastructure, and declarative APIs as the representative technologies.
On the exam you need to be able to classify what each keyword in this definition means.
| Keyword | Meaning |
|---|---|
| container | A unit that packages an application together with its dependencies so it runs identically anywhere |
| microservices | A structure that splits one giant application into small, independently deployable services |
| declarative API | A model where you declare the desired state rather than the imperative steps, and the system converges to that state |
| immutable infrastructure | An operating model that replaces a running server wholesale with a new image instead of patching it in place |
| service mesh | A layer that handles inter-service communication, observability, and security outside the application code |
Two more operational traits underpin this definition.
- Self-healing. When a container or node dies, the system recreates and recovers it on its own. The reconciliation loop, in which Kubernetes controllers continually close the gap between the desired state and the current state, is exactly the implementation of self-healing.
- Resilience. The system is designed to keep working even when some components fail. Replication, fault isolation, and retries are the means to that end.
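The reconciliation loop behind self-healing can be sketched in a few lines. This is a toy model, not how a Kubernetes controller is actually written: the `Cluster` class, its fields, and the Pod names are all hypothetical stand-ins for real cluster state.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state; not a real Kubernetes API."""
    desired_replicas: int
    running: list[str] = field(default_factory=list)
    _counter: int = 0

    def start_pod(self) -> None:
        self._counter += 1
        self.running.append(f"pod-{self._counter}")

    def stop_pod(self) -> None:
        self.running.pop()


def reconcile(cluster: Cluster) -> int:
    """Run one pass of the loop and return how many actions were taken."""
    actions = 0
    # Too few Pods: create replacements until the count matches.
    while len(cluster.running) < cluster.desired_replicas:
        cluster.start_pod()
        actions += 1
    # Too many Pods: remove the surplus.
    while len(cluster.running) > cluster.desired_replicas:
        cluster.stop_pod()
        actions += 1
    return actions


cluster = Cluster(desired_replicas=3)
reconcile(cluster)               # initial convergence: three Pods started
cluster.running.remove("pod-2")  # simulate a Pod crash
print(reconcile(cluster), cluster.running)  # 1 ['pod-1', 'pod-3', 'pod-4']
```

Nothing in the loop reacts to the crash itself. It only compares the desired state with the current state, which is why the same code handles both the first rollout and later failures.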
It is also worth knowing the 12-factor app, a set of twelve design guidelines for cloud native applications. Examples include storing configuration in environment variables, running the application as stateless processes, and treating logs as event streams. Together they describe how to build applications that are easy to scale and deploy on the cloud.
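Two of those guidelines, configuration in environment variables and logs as event streams, are easy to picture in code. The sketch below is only an illustration; the variable names `DATABASE_URL` and `PORT` and their default values are assumptions chosen for the example, not names the 12-factor text mandates.

```python
import os
import sys


def load_config(env=os.environ):
    """Read settings from environment variables instead of a baked-in file."""
    return {
        # DATABASE_URL and PORT are illustrative names and defaults.
        "database_url": env.get("DATABASE_URL", "postgres://localhost/dev"),
        "port": int(env.get("PORT", "8080")),
    }


def log_event(message, stream=sys.stdout):
    """Write one log line to stdout and leave routing to the platform."""
    print(message, file=stream, flush=True)


config = load_config({"PORT": "9000"})
log_event(f"listening on port {config['port']}")  # listening on port 9000
```

Because the same image reads its settings at start-up, it can move between environments without being rebuilt.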
Autoscaling #
Cloud native’s core promise is that it scales resources up and down automatically to match load. The Kubernetes ecosystem divides this autoscaling across several layers, and KCNA asks whether you can pin down precisely what each one adjusts.
HPA (Horizontal Pod Autoscaler) #
HPA scales the number of Pods horizontally, up and down. When CPU utilization, memory, or a custom metric exceeds a configured target, it raises the Pod replica count; when load drops, it lowers it again. “When traffic spikes, run multiple copies of the same Pod” is the picture of HPA. It is the most commonly used autoscaler and the one that appears most often on the exam.
VPA (Vertical Pod Autoscaler) #
VPA adjusts the resources allocated to a single Pod (CPU and memory requests/limits) vertically. Instead of the number of Pods, it grows or shrinks the size of an individual Pod. It automatically recommends and applies an appropriate resource request, reducing waste from over-allocation and performance degradation from under-allocation. Because it conflicts with HPA when both use the same metrics (CPU and memory) at once, the two are usually not applied together to the same workload.
Cluster Autoscaler (CA) #
CA adjusts the number of nodes. When a Pod has nowhere to be scheduled and stays Pending, it asks the cloud provider to add nodes; when nodes sit empty, it removes them to cut costs. When HPA adds Pods but there aren’t enough nodes to place them on, CA adds nodes — so HPA and CA cooperate across different layers.
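The two triggers described above, Pending Pods and empty nodes, can be sketched as follows. This is a deliberately simplified model: it considers CPU only and assumes Pods pack perfectly onto nodes, and the function names and node names are hypothetical. The real Cluster Autoscaler weighs considerably more than this.

```python
import math


def nodes_to_add(pending_cpu_requests, cpu_per_node):
    """Estimate how many new nodes the Pending Pods need.

    Simplification: only CPU is counted and perfect packing is assumed.
    """
    total = sum(pending_cpu_requests)
    return math.ceil(total / cpu_per_node) if total > 0 else 0


def nodes_to_remove(pods_per_node):
    """Return the names of nodes that currently run no Pods."""
    return [name for name, pods in pods_per_node.items() if pods == 0]


print(nodes_to_add([0.5, 0.5, 1.5], cpu_per_node=2.0))  # 2
print(nodes_to_remove({"node-a": 4, "node-b": 0}))      # ['node-b']
```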
KEDA (Kubernetes Event-Driven Autoscaling) #
KEDA is a CNCF project that handles event-driven autoscaling. Beyond CPU and memory, it scales Pods using external event sources as metrics — the backlog length of a message queue, the lag of a Kafka topic, the result of a database query. The key point is that it supports scale-to-zero, dropping Pods all the way down to zero when there are no events to process. HPA keeps a minimum of one by default, but KEDA can go down to zero and spin back up when an event arrives.
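As a rough illustration of event-driven scaling with scale-to-zero, the function below turns a queue backlog into a replica count. It is a sketch of the idea rather than KEDA’s implementation; the function name and the parameters `target_per_replica` and `max_replicas` are assumptions made for the example.

```python
import math


def replicas_for_backlog(queue_length, target_per_replica, max_replicas):
    """Replica count for a queue backlog, allowing zero when the queue is empty."""
    if queue_length <= 0:
        return 0  # scale-to-zero: nothing to process, so run nothing
    wanted = math.ceil(queue_length / target_per_replica)
    return min(wanted, max_replicas)


print(replicas_for_backlog(0, target_per_replica=5, max_replicas=10))    # 0
print(replicas_for_backlog(12, target_per_replica=5, max_replicas=10))   # 3
print(replicas_for_backlog(500, target_per_replica=5, max_replicas=10))  # 10
```

The first call is the case a CPU-based autoscaler with a minimum of one cannot express: an empty queue maps to zero Pods.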
Comparing the four autoscalers #
| Mechanism | What it adjusts | Basis | Characteristics |
|---|---|---|---|
| HPA | Pod count (horizontal) | CPU, memory, custom metrics | Most common. Keeps a minimum of one by default |
| VPA | Pod resource size (vertical) | CPU and memory usage patterns | Conflicts with HPA on the same metrics |
| Cluster Autoscaler | Node count | Pending Pods / idle nodes | Directly affects infrastructure cost |
| KEDA | Pod count (event-driven) | Queue length, Kafka lag, other external events | Supports scale-to-zero |
The exam point is clear. HPA is horizontal (count), VPA is vertical (size), CA is nodes, and KEDA is events and scale-to-zero. Keep that one line straight and most questions in this area fall into place. I have written up the hands-on flow of running these myself in Practical Track Intermediate #6 Autoscaling.
How autoscaling works #
For autoscaling to work, there has to be a metric source that measures the current load. HPA reads CPU and memory usage from the cluster’s metrics-server, or other values from a custom metrics API, computes the ratio of the current value to the target, and multiplies the current replica count by that ratio to derive the desired replica count. Writing that value to the Deployment’s replica count is one cycle of HPA. VPA draws on the same usage data but derives resource requests rather than a replica count.
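The replica calculation can be written out as a short function. This follows the rule given in the Kubernetes HPA documentation, which scales the current replica count by the ratio of the current metric value to the target and rounds up, skipping changes when the ratio is within a small tolerance of 1. The function and parameter names are my own, and the sketch ignores min/max bounds and other details of the real controller.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count from the ratio of the current metric value to its target."""
    ratio = current_value / target_value
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, current_value=90, target_value=60))  # 6
print(desired_replicas(4, current_value=30, target_value=60))  # 2
print(desired_replicas(4, current_value=63, target_value=60))  # 4
```

With four replicas at 90% CPU against a 60% target, the ratio is 1.5, so the count grows to six. At 63% the ratio sits inside the tolerance and nothing changes.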
The thing to note here is that the autoscaler, too, operates on top of the declarative model. HPA does not issue a one-shot “add Pods” command; it keeps recalculating the desired replica count and converges the current state toward that value. The stabilization window, which keeps the Pod count from oscillating wildly when load fluctuates, exists for this very reason. Like self-healing, autoscaling is best understood as a form of the reconciliation loop, which makes the context of the exam questions easy to grasp.
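One way to picture the stabilization window is as a short memory of recent recommendations: a scale-down only goes as low as the highest value recommended inside the window. The sketch below assumes a 300-second window, which matches the documented default for scale-down; the class name and its methods are hypothetical, and the real controller has more rules than this.

```python
from collections import deque


class ScaleDownStabilizer:
    """Remember recent recommendations and damp scale-downs with them."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # entries are (timestamp, recommended_replicas)

    def stabilize(self, now, recommended, current):
        self._history.append((now, recommended))
        # Drop recommendations that have aged out of the window.
        while self._history and now - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        highest_recent = max(r for _, r in self._history)
        # Scale-ups pass through; scale-downs wait until every recent
        # recommendation agrees on a lower value.
        return highest_recent if recommended < current else recommended


s = ScaleDownStabilizer()
print(s.stabilize(now=0, recommended=10, current=10))    # 10
print(s.stabilize(now=60, recommended=4, current=10))    # 10
print(s.stabilize(now=400, recommended=4, current=10))   # 4
```

The dip at 60 seconds is ignored because a recommendation of ten is still inside the window. Only once that has aged out does the count drop to four.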
Serverless #
Serverless is a model where you run code (a function or a container) without managing the infrastructure yourself. Despite the name, it does not mean there are no servers; it means the cloud provider or platform takes over operations such as provisioning, scaling, and patching servers, so that the developer never has to think about servers.
The characteristics of serverless are as follows.
- It scales to zero when there are no requests (scale-to-zero). Because it consumes no resources while idle, it is cost-efficient.
- You pay only for what you use. Cost is charged based on execution time and the number of invocations.
- It is event-driven. Events such as an HTTP request, an arriving message, or a file upload trigger a function’s execution.
FaaS and Knative #
The representative form of serverless is FaaS (Function as a Service). You deploy code as functions, and the platform runs and scales them according to invocations. Managed FaaS from public clouds is widely used, but what matters more in KCNA are the projects that implement serverless on top of Kubernetes.
- Knative. A CNCF platform for running serverless workloads on top of Kubernetes. It provides scale-to-zero, dropping Pods to zero when there are no requests and automatically spinning them back up when a request arrives, along with event-routing (Eventing) capabilities. The answer to a question about “serverless on top of Kubernetes” is almost always Knative.
- OpenFaaS. Another serverless framework that packages and runs functions as containers on top of Kubernetes.
When is serverless a good fit #
Serverless fits bursty or intermittent workloads, short tasks that react to events, and prototypes you build and ship quickly to try out. Conversely, it is a poorer fit for steady, constant load, for latency-sensitive services that can’t tolerate cold-start delay, and for long-running tasks.
| Situation | Serverless fit |
|---|---|
| Intermittent, unpredictable traffic | Good fit. Scales to zero when idle to cut costs |
| Event-triggered processing (uploads, messages) | Good fit. Matches the event-driven execution model |
| Always high, constant load | Poorer fit. With constant execution, the cost advantage shrinks |
| Latency-sensitive services | Caution. You must weigh cold-start delay |
| Long-running batch jobs | Poorer fit. Easily hits execution-time limits |
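The cost rows in the table come down to simple arithmetic. The sketch below compares an always-on service with pay-per-use billing; every price in it is a made-up illustration value, not a real provider’s rate, and the function names are hypothetical.

```python
def always_on_cost(hourly_rate, hours_per_month=730):
    """Cost of a service that runs all month regardless of traffic."""
    return hourly_rate * hours_per_month


def pay_per_use_cost(invocations, seconds_per_invocation, rate_per_second):
    """Cost when billing follows execution time and idle time is free."""
    return invocations * seconds_per_invocation * rate_per_second


def break_even_invocations(hourly_rate, seconds_per_invocation, rate_per_second,
                           hours_per_month=730):
    """Monthly invocation count at which both models cost the same."""
    return always_on_cost(hourly_rate, hours_per_month) / (
        seconds_per_invocation * rate_per_second
    )


# All prices below are made-up illustration values.
HOURLY, PER_SECOND, DURATION = 0.05, 0.00002, 0.2
for invocations in (100_000, 20_000_000):
    serverless = pay_per_use_cost(invocations, DURATION, PER_SECOND)
    fixed = always_on_cost(HOURLY)
    print(invocations, "serverless" if serverless < fixed else "always-on")
```

Under these assumed prices the intermittent workload is far cheaper on pay-per-use, while the constantly busy one crosses the break-even point and loses the advantage.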
A cold start is the delay incurred when a workload that has scaled to zero must spin back up to serve its first request. It is the price of the cost savings that scale-to-zero provides, and as the signature tradeoff of serverless it is likely to appear on the exam.
Community and governance #
The cloud native ecosystem is not any single company’s product but a community of hundreds of projects gathered under the CNCF, an open-governance foundation. KCNA asks about the structure of this community.
The role of the CNCF #
The CNCF is a non-profit foundation under the Linux Foundation that neutrally hosts and nurtures cloud native open source projects. It maintains vendor-neutral governance so that no single vendor can monopolize a project, reviews project maturity, and runs certification and education programs. That Google originally donated Kubernetes to the CNCF also shows up on the exam occasionally.
Project maturity levels #
CNCF projects are divided into three levels by maturity. Memorizing the order and the meaning is the key exam point in this area.
| Level | Meaning |
|---|---|
| Sandbox | Early, experimental stage. Low barrier to entry, still being validated |
| Incubating | Growth stage. Real production adoption and active contribution have been demonstrated |
| Graduated | Mature stage. The top tier, with broad adoption and stable governance |
The flow is Sandbox → Incubating → Graduated. As a project moves up, the bar for adoption rate, stability, and governance maturity rises. Representative examples of Graduated projects are as follows.
- Kubernetes. Container orchestration. The CNCF’s first graduated project.
- Prometheus. Metric collection and monitoring. The second graduated project.
- Envoy. A high-performance proxy and the data plane of the service mesh.
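- Others such as containerd, Helm, etcd, Fluentd, and Argo (the project that includes Argo CD) have also reached the graduated level. OpenTelemetry is often assumed to be in this group but has been listed at the Incubating level, so check the CNCF project list for its current status.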
Open governance and KubeCon #
CNCF projects run on an open-governance model in which multiple organizations and contributors take part in decision-making rather than a single company. Maintainers, contributors, and the community set the roadmap through open processes. The flagship event where this community gathers is KubeCon + CloudNativeCon. Hosted by the CNCF, it is the largest conference in the cloud native ecosystem, where projects are presented and case studies are shared.
Open standards #
Cloud native isn’t locked into a particular vendor because it is built on open standards. When a standard interface is defined, you can freely swap implementations in and out while the layers above keep working unchanged. This is how you avoid vendor lock-in and secure interoperability.
The major standards that appear in KCNA are as follows.
| Standard | What it defines |
|---|---|
| OCI (Open Container Initiative) | The container image format and runtime specs. An image built by any OCI-compliant tool runs on any OCI-compliant runtime |
| CRI (Container Runtime Interface) | The standard between the Kubernetes kubelet and the container runtime. containerd and CRI-O are interchangeable |
| CNI (Container Network Interface) | The container networking plugin standard. Calico, Cilium, and others are interchangeable |
| CSI (Container Storage Interface) | The container storage plugin standard. Various storage backends are interchangeable |
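| SMI (Service Mesh Interface) | A standard interface for working with a service mesh, allowing mesh configuration that is independent of the implementation. The project has since been archived by the CNCF, but the name can still appear in study material |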
| OpenTelemetry (OTel) | The standard for collecting and transmitting metric, log, and trace telemetry data |
The essence of every standard is the same: “the layer above knows only the standard interface, and you swap the implementation below it freely.” For example, Kubernetes only needs to know the CRI interface, so it works the same whether containerd or CRI-O sits behind it. The boundaries of CRI, CNI, and CSI were covered in detail in #4. OCI and CRI are easy to confuse: OCI defines the image and runtime specs, while CRI defines the connection between Kubernetes and the runtime, so keep the two separate.
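The “know only the interface” idea is the same one programmers use when coding against an interface instead of a concrete class. The sketch below is an analogy only: `ContainerRuntime`, `RuntimeA`, `RuntimeB`, and `start_pod` are hypothetical names, and the real CRI is a gRPC API with a much larger surface.

```python
from typing import Protocol


class ContainerRuntime(Protocol):
    """Hypothetical stand-in for a standard interface such as CRI."""

    def run(self, image: str) -> str: ...


class RuntimeA:
    def run(self, image: str) -> str:
        return f"runtime-a started {image}"


class RuntimeB:
    def run(self, image: str) -> str:
        return f"runtime-b started {image}"


def start_pod(runtime: ContainerRuntime, image: str) -> str:
    """The caller depends only on the interface, never on the concrete class."""
    return runtime.run(image)


# Swapping the implementation requires no change to start_pod.
print(start_pod(RuntimeA(), "nginx:1.27"))  # runtime-a started nginx:1.27
print(start_pod(RuntimeB(), "nginx:1.27"))  # runtime-b started nginx:1.27
```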
Rollout and deployment concepts #
Cloud native architecture presupposes a deployment model that replaces a service with a new version without stopping it. The method Kubernetes Deployments provide by default is the rolling update.
- Rolling update. Instead of terminating all existing Pods at once, it brings up new-version Pods a few at a time while taking down old-version Pods at the same pace. Because some Pods are always serving traffic throughout the process, the deployment can complete without downtime. If something goes wrong, the Deployment can be rolled back to the previous version.
- Immutable infrastructure. Rather than fixing a running container in place, it replaces it wholesale with a new Pod built from a new image. Pods started from the same image are always in the same state, which reduces problems caused by environment differences and keeps rollbacks simple.
These two concepts fit together with cloud native’s self-healing, declarative model. Declare the desired state (for example, three Pods of version v2) and Kubernetes converges the current state toward it without downtime, automatically recreating any Pods that die along the way.
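The stepwise replacement in a rolling update can be simulated in a few lines. The sketch borrows the names of the Deployment settings `maxSurge` and `maxUnavailable` but treats them as plain Pod counts, and it assumes every new Pod becomes ready immediately, which a real rollout does not guarantee. The function itself is hypothetical.

```python
def rolling_update(replicas, max_surge=1, max_unavailable=0):
    """Yield (old, new) Pod counts after each step of a simplified rollout."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be zero")
    old, new = replicas, 0
    while old > 0 or new < replicas:
        # Surge: start new Pods without exceeding replicas + max_surge in total.
        start = min(replicas - new, replicas + max_surge - (old + new))
        new += start
        # Retire old Pods while keeping at least
        # replicas - max_unavailable Pods in service.
        stop = min(old, (old + new) - (replicas - max_unavailable))
        old -= stop
        yield old, new


print(list(rolling_update(3)))  # [(2, 1), (1, 2), (0, 3)]
```

At every step the total stays at three or more, which is the property that lets traffic keep flowing during the rollout.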
Exam point summary #
Here are the comparisons that most often decide points in this domain.
- HPA vs VPA vs CA. HPA is the Pod count (horizontal), VPA is the Pod resource size (vertical), and CA is the node count. KEDA is event-driven with scale-to-zero.
- Project maturity level order. Sandbox → Incubating → Graduated. As a project moves up, the bar for adoption rate and stability rises.
- Serverless definition. Not a model with no servers, but one where you don’t have to think about operating servers. Serverless on top of Kubernetes is Knative, and scale-to-zero is the key.
- Distinguishing the open standards. OCI (image and runtime), CRI (runtime connection), CNI (network), CSI (storage), OTel (telemetry). The purpose of a standard is to avoid vendor lock-in and ensure interoperability.
- The nature of the CNCF. Vendor-neutral governance under the Linux Foundation. KubeCon is its flagship event.
Common traps #
- Hanging HPA and VPA on the same metric at once. If both use CPU and memory as their basis, you get a conflict where one grows the size while the other shrinks the count. If an option presents this combination as a normal configuration, it is a trap.
- Mistaking scale-to-zero for a built-in HPA feature. Default HPA keeps a minimum of one. Dropping to zero is a trait of KEDA or Knative.
- Reversing the maturity level order. Sandbox is the earliest, Graduated is the most mature. Options that pick Incubating as the highest level are mixed in often.
- Lumping OCI and CRI together as the same standard. OCI is the image/runtime format spec, while CRI is the interface that connects the Kubernetes kubelet and the runtime. They sit at different layers.
Wrap-up #
What we pinned down in this post:
- Cloud native. Containers, microservices, declarative APIs, immutable infrastructure, service mesh. Self-healing and resilience are the operational traits
- Autoscaling. The distinction among four layers — HPA (Pod horizontal), VPA (Pod vertical), CA (nodes), KEDA (events, scale-to-zero)
- Serverless. An execution model where you don’t think about infrastructure. Knative and FaaS, scale-to-zero
- Community. The CNCF’s vendor-neutral governance, the Sandbox → Incubating → Graduated maturity levels, KubeCon
- Open standards. OCI, CRI, CNI, CSI, SMI, OpenTelemetry. They avoid vendor lock-in and ensure interoperability
- Rollout. Zero-downtime deployment via rolling updates, wholesale replacement via immutable infrastructure
If Kubernetes is brand new to you, the shortcut to internalizing these concepts is to first get the feel of spinning up a cluster yourself in Practical Track #1.
Next: Cloud Native Observability #
We’ve laid out the design philosophy and the ecosystem. Now we move on to how to look inside a system while it’s running.
In #6 Cloud Native Observability (8%) — Telemetry, Prometheus, Cost Management we’ll cover the three pillars of telemetry (metrics, logs, traces), metric collection via Prometheus, SLI/SLO/SLA, and the concept of cost management (FinOps).