Certified Kubernetes Administrator (CKA) #16 Storage 1: Volume Types, PV, PVC, and Static Provisioning

If #15 Resource Management covered how a Pod requests CPU and memory and how those are limited, this post turns to data. The filesystem inside a container is temporary space that disappears along with the container when it dies, so data that must outlive a Pod’s lifecycle, such as DB data or user uploads, needs a separate storage model. Kubernetes expresses this with three abstractions: Volume, PersistentVolume, and PersistentVolumeClaim.

This post is the first installment of the Storage domain and covers Volume types and the static provisioning of PVs and PVCs. Dynamic provisioning and StorageClass follow in #17. Storage carries only a 10% exam weight, but a PVC stays Pending if even one field fails to match its PV, so the binding rules are worth making second nature.

The two problems Volume solves

The filesystem a container in a Pod uses has two limitations. First, when the container restarts, every file written in the meantime is gone. Second, containers in the same Pod each have their own filesystem, so they have no built-in way to share files. Volume is the abstraction that solves both at once. A Volume is declared in the Pod’s spec.volumes and mounted at a specific path through each container’s volumeMounts.

That said, a Volume’s lifetime varies by type. Some Volumes disappear when the Pod is gone; others survive independently of the Pod. This difference is exactly what divides the Volume types.

Volume types

The Volumes you need to know for CKA fall into roughly four groups.

| Type | Lifetime | Use |
|---|---|---|
| emptyDir | Same as the Pod | Temporary sharing between containers in a Pod, cache/scratch |
| hostPath | Tied to the node disk | Mount a specific path on the node (single node/testing) |
| configMap / secret | Same as the Pod (source is a separate object) | Inject config/secrets as files |
| persistentVolumeClaim | Lifetime of the PVC/PV | True persistent storage |

emptyDir: temporary space for the Pod’s lifetime

emptyDir is created as an empty directory when the Pod is scheduled onto a node, and it disappears together with the Pod when the Pod is removed from the node. It’s most commonly used when two containers in the same Pod exchange files.

apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /cache/data; sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /cache
    - name: reader
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /shared
  volumes:
    - name: scratch
      emptyDir: {}

The two containers mount the same Volume, scratch, at different paths: the file the writer creates at /cache/data is visible to the reader as /shared/data. A container restart alone does not clear the directory, but when the Pod is deleted, this data is gone.
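
For the cache/scratch use case, an emptyDir can also be backed by memory and capped in size. The manifest below is a minimal sketch of that variant; the Pod name, the volume name, and the 64Mi limit are illustrative assumptions, not values from this series.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-memory-demo   # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory         # back the directory with tmpfs (RAM) instead of node disk
        sizeLimit: 64Mi        # illustrative cap on how much the volume may hold
```

Files written to a memory-backed emptyDir count toward the container’s memory usage, so the limit is worth choosing together with the memory limits from #15.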

hostPath: mounting the node disk directly

hostPath mounts a specific path on the node into the Pod. Since the data stays on the node disk, it survives Pod restarts, but if the Pod moves to a different node it can no longer reach that data. That’s why it’s rarely used for ordinary workloads in a production cluster, and is limited to single-node environments or system components that read the node’s own logs or sockets.

apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: node-data
          mountPath: /data
  volumes:
    - name: node-data
      hostPath:
        path: /var/lib/node-data
        type: DirectoryOrCreate
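
For the “read the node’s own logs” case mentioned above, a read-only mount with a stricter type is usually the safer shape. The following is a sketch under assumed names (hostpath-logs-demo, node-logs) and an assumed node path of /var/log.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-logs-demo     # hypothetical name
spec:
  containers:
    - name: log-reader
      image: busybox
      command: ["sh", "-c", "ls /host-logs; sleep 3600"]
      volumeMounts:
        - name: node-logs
          mountPath: /host-logs
          readOnly: true       # the container cannot modify the node's files
  volumes:
    - name: node-logs
      hostPath:
        path: /var/log         # assumed to exist on the node
        type: Directory        # fail instead of creating the path if it is missing
```

Unlike DirectoryOrCreate, type: Directory refuses to start the Pod when the path is absent, which surfaces a wrong path early instead of silently creating an empty directory.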

configMap and secret: config as files

configMap and secret are ways of mounting the objects covered in #12 as a Volume. You use them when you want to inject configuration as files instead of environment variables. Here the Volume’s data itself lives in the ConfigMap/Secret object, and the Volume merely projects it into a container path.
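
As a minimal sketch of this projection, the manifest below mounts a ConfigMap as a directory of files. The ConfigMap name app-config, its single key, and the mount path are hypothetical and chosen only for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config             # hypothetical name
data:
  app.properties: |            # each key becomes one file in the mounted directory
    log.level=info
---
apiVersion: v1
kind: Pod
metadata:
  name: configmap-volume-demo  # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "cat /etc/app/app.properties; sleep 3600"]
      volumeMounts:
        - name: config
          mountPath: /etc/app
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: app-config       # for a Secret, use `secret:` with `secretName:` instead
```

Deleting the Pod removes only the projected files; the ConfigMap object and its data remain in the cluster.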

PersistentVolume: a piece of storage the cluster owns

emptyDir and hostPath are both tied to a node, so they can’t give true persistence. To guarantee access to the same data regardless of which node a Pod lands on, you need storage that lives outside the node, and Kubernetes represents one piece of that storage as a cluster-level object called a PersistentVolume (PV). A PV does not belong to a namespace, and it’s either created ahead of time by an admin or produced dynamically by a StorageClass.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-5g
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  nfs:
    server: 10.0.0.10
    path: /exports/data

capacity and accessModes

capacity.storage is the capacity this PV provides. accessModes declares how this PV can be mounted, and there are four of them.

| accessMode | Abbreviation | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | Read/write from one node |
| ReadOnlyMany | ROX | Read-only from many nodes |
| ReadWriteMany | RWX | Read/write from many nodes |
| ReadWriteOncePod | RWOP | Read/write by a single Pod only |

Here the “Once” in RWO means one node, not one Pod, a point that comes up often on the exam. Multiple Pods running on the same node can mount an RWO PV simultaneously. To truly allow only one Pod, you must use RWOP. Which accessModes you can actually use is decided by the storage backend: NFS supports RWX, while a typical block-device-based backend supports only RWO.
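
To make the RWO/RWOP distinction concrete, here is a sketch of a claim that asks for single-Pod access. The claim name and size are hypothetical. Note as an assumption to check against your cluster: ReadWriteOncePod is supported only for CSI volumes, so the PV that satisfies this claim would need to be CSI-backed and list ReadWriteOncePod itself.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-single-pod         # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod         # only one Pod in the whole cluster may use the volume
  resources:
    requests:
      storage: 1Gi
  storageClassName: manual     # assumes a matching CSI-backed PV with this class exists
```

With ReadWriteOnce in the same position, a second Pod on the same node could mount the volume as well; with ReadWriteOncePod, a second Pod that references the claim is not admitted while the first is using it.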

persistentVolumeReclaimPolicy

persistentVolumeReclaimPolicy determines what to do with a PV once the PVC is deleted and the PV is released.

| Policy | Behavior |
|---|---|
| Retain | Preserves the PV and leaves it in the Released state. Data is kept; reuse is handled manually by the admin |
| Delete | Deletes the PV together with its backing storage |
| Recycle | (Deprecated) Wipes the data and reuses it. Not used today |

A PV created by static provisioning usually uses Retain to protect its data. A PV released under Retain enters the Released state, and in that state it won’t be automatically bound to a new PVC. To use it again, the admin must clear the PV’s claimRef, clean up the data, and return it to Available.
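
As a rough illustration of what “clear the claimRef” refers to, the excerpt below sketches the relevant fields of pv-nfs-5g after pvc-data has been deleted. The values are illustrative, not captured output, and unrelated fields are omitted.

```yaml
# Illustrative excerpt of a Released PV (what `k get pv pv-nfs-5g -o yaml` would show in part)
spec:
  persistentVolumeReclaimPolicy: Retain
  claimRef:                    # still records the claim that no longer exists
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-data
    namespace: default
    # the uid and resourceVersion of the old claim also appear here
status:
  phase: Released              # not Available, so no new PVC will bind to it
```

Removing the claimRef block, for example through k edit pv pv-nfs-5g, lets the PV return to Available. That step does not touch the data on the backend, so cleaning it up remains a separate manual task.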

PersistentVolumeClaim: the user’s request

If a PV is the supply side — the cluster saying “this storage is available” — a PersistentVolumeClaim (PVC) is the demand side: a user asking for storage that meets certain conditions. A PVC belongs to a namespace, and a Pod never references a PV directly — it always goes through a PVC. Thanks to this separation, an app developer can request storage by writing only the capacity and access mode, without knowing the actual storage type.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-data
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual

The rules for binding a PVC to a PV

When you create a PVC, the controller finds a PV in the Available state that matches the conditions and binds it one-to-one. For the binding to hold, all three of the following must match.

  1. Capacity. The PV’s capacity.storage must be at least the PVC’s requests.storage. If the PVC requests 5Gi but the PV is 3Gi, it won’t bind. Conversely, if the PV is larger, it binds, but the leftover capacity is wasted (a static PV isn’t split).
  2. accessModes. The PV must support every accessMode the PVC requested.
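  3. storageClassName. The PVC’s and the PV’s storageClassName must be the same. In static provisioning, you either write the same name like manual on both sides, or use no class on both sides. For the no-class case, set storageClassName: "" explicitly on the PVC: if the field is omitted and the cluster has a default StorageClass, the PVC is assigned that default class and no longer matches a class-less PV.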
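
The no-class variant of the third rule can be sketched as the pair below. All names, the path, and the sizes are hypothetical; the point is only the explicit empty string on the claim.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-noclass-1g          # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""         # this PV belongs to no class
  hostPath:
    path: /mnt/noclass         # hypothetical path, single-node testing only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-noclass            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ""         # explicit empty string: bind only to class-less PVs
```

Writing the empty string, rather than omitting the field, keeps a default StorageClass from being filled in on the claim.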

If even one of these is off, the PVC stays Pending. On the exam, when a PVC won’t move to Bound, comparing the three items above side by side with the PV is the fastest diagnosis. You can also narrow it down with a selector (spec.selector) so it accepts only PVs with a specific label.

# Check the status of the PVC and PV side by side
k get pvc,pv

# Check the reason when binding fails
k describe pvc pvc-data
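
The selector mentioned above can be sketched as follows. The label key and value (tier: archive), the object names, and the NFS path are assumptions made for this example.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-archive-5g          # hypothetical name
  labels:
    tier: archive              # hypothetical label the claim will select on
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  nfs:
    server: 10.0.0.10
    path: /exports/archive     # hypothetical export
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-archive            # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual
  selector:
    matchLabels:
      tier: archive            # only PVs carrying this label are candidates
```

The selector narrows the candidate set; it does not replace the three conditions, so capacity, accessModes, and storageClassName still have to match.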

Mounting a PVC in a Pod

Once a PVC is Bound to a PV, the Pod references that PVC as a Volume and mounts it. From the Pod’s point of view, it doesn’t need to care whether NFS or a cloud disk sits behind it — it just writes the PVC name.

apiVersion: v1
kind: Pod
metadata:
  name: app-with-pvc
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-data

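The persistentVolumeClaim.claimName under volumes points to the PVC created earlier, which must live in the same namespace as the Pod. If this PVC is still Pending, the Pod can’t be scheduled either and stays Pending. (A Pod stuck in ContainerCreating has already been scheduled, which usually points to a failure attaching or mounting the volume rather than to binding.) In other words, the Pod’s startup depends on the PVC’s binding, and the PVC’s binding depends on the existence of a matching PV, in that chain.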

Static provisioning: the admin creates PVs ahead of time

The flow so far is exactly static provisioning. The admin prepares storage in advance and registers it as PV objects, and the user requests one with a PVC. The PV exists first, and the PVC picks one of them — that’s the order.

Admin: create PV  →  PV(Available)
                       ↑ binding (capacity,accessModes,SC match)
User: create PVC  →  PVC  →  Pod mounts the PVC

Static provisioning suits environments with predetermined storage like an NFS server, or environments where the admin wants to control which team uses which storage. Its drawbacks are clear too. Every time a new request comes in, the admin has to create a PV by hand, and if the capacity doesn’t match exactly, space is wasted. Removing this manual work is the dynamic provisioning covered in #17.
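
When the admin wants a particular PV to go to a particular team, one option is to pre-bind the two objects instead of leaving the match to the controller. The sketch below assumes a namespace team-a already exists; that namespace, the object names, and the export path are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-team-a              # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  claimRef:                    # reserve this PV for one specific claim
    namespace: team-a
    name: pvc-team-a
  nfs:
    server: 10.0.0.10
    path: /exports/team-a      # hypothetical export
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-team-a
  namespace: team-a            # assumed to exist
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual
  volumeName: pv-team-a        # ask for exactly this PV
```

The claimRef on the PV keeps other claims from taking it, and volumeName on the PVC keeps the claim from landing on a different PV. Capacity, accessModes, and storageClassName are still checked.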

The whole flow in one go

Let’s run one cycle of static provisioning by tying together a PV, a PVC, and a Pod.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-manual-1g
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-manual
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: manual
---
apiVersion: v1
kind: Pod
metadata:
  name: pv-consumer
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: store
          mountPath: /data
  volumes:
    - name: store
      persistentVolumeClaim:
        claimName: pvc-manual

The PVC requested 500Mi but binds to the 1Gi PV (the PV’s capacity just needs to be at least the request). The accessModes and storageClass (manual) match on both sides, so the binding succeeds. Let’s check the status after applying.

k apply -f static.yaml
k get pv pv-manual-1g       # STATUS Bound, CLAIM is default/pvc-manual
k get pvc pvc-manual        # STATUS Bound, VOLUME is pv-manual-1g
k get pod pv-consumer       # STATUS Running
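
For contrast, here is a sketch of a claim that would not bind in this setup, which is the kind of object a troubleshooting task tends to hand you. The name pvc-too-big is hypothetical, and the example assumes no other PV with class manual exists and no StorageClass named manual can provision one.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-too-big            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi             # larger than the 1Gi that pv-manual-1g offers
  storageClassName: manual
```

Under those assumptions the claim stays Pending, and any Pod that references it stays Pending with it. Comparing the request against the capacity shown by k get pv is enough to spot the mismatch.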

Exam points

Here are the spots that decide your score on CKA Storage tasks.

  • RWO is per node. Don’t confuse “Once” with a Pod — it means a node. To allow only one Pod, it’s RWOP.
  • The 3 binding conditions. When a PVC is Pending, compare the three — capacity (PV ≥ PVC), accessModes, storageClassName — side by side with the PV. If even one is off, it won’t bind.
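  • storageClassName match. In static provisioning, write the PV’s and PVC’s storageClassName the same, or use no class on both. A PVC that omits the field is assigned the default StorageClass when one exists, so it won’t bind to a class-less PV; write storageClassName: "" on the PVC to be explicit.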
  • reclaimPolicy Retain. A PV released under Retain stays in the Released state and isn’t automatically rebound. If preserving data is the goal, it’s Retain.
  • A Pod references only a PVC. A Pod doesn’t mount a PV directly. It always goes through a PVC.
  • Quick diagnosis commands. Make it second nature to view status side by side with k get pvc,pv and check the Pending reason with k describe pvc.

There’s a post that unpacks the same topic from an app developer’s angle: K8s Intermediate #2 PV / PVC / StorageClass. If the model isn’t clicking, reading it alongside helps.

Wrap-up

What this post locked in:

  • Volume types. emptyDir (temporary space for the Pod’s lifetime), hostPath (node disk), configMap/secret (config injection), persistentVolumeClaim (persistent storage)
  • PV. capacity, accessModes (RWO/ROX/RWX/RWOP), persistentVolumeReclaimPolicy (Retain/Delete). A cluster-level object that doesn’t belong to a namespace
  • PVC. A namespaced object that requests capacity and access mode. A Pod references only a PVC
  • Binding rules. All three conditions — capacity (PV ≥ PVC), accessModes, storageClassName — must match for Bound
  • Static provisioning. The admin creates PVs ahead of time and the user picks one with a PVC. Manual work and wasted capacity are its limits

Next: Storage 2

Static provisioning carried the burden of the admin creating PVs one by one. In #17 Storage 2, we’ll cover StorageClass and dynamic provisioning, which remove this manual work. We’ll continue through the flow where creating a PVC produces a PV automatically, how reclaim policy works in a dynamic environment, the expansion that grows PVC capacity with allowVolumeExpansion, and how volumeBindingMode’s WaitForFirstConsumer aligns scheduling with storage location.
