Chapter 5

Service

The Service is the abstraction that solves the problem of ephemeral Pod IPs. This chapter covers the stable ClusterIP, the selector, and Endpoints / EndpointSlice; the criteria for choosing among the three types ClusterIP, NodePort, and LoadBalancer; kube-proxy’s DNAT; and CoreDNS’s short-name resolution.

In Chapter 4 Deployment and ReplicaSet we got as far as keeping 3 Pods up automatically. What still nags is that those 3 Pod IPs change every time a Pod is recreated. This chapter covers the abstraction that solves that problem, the Service. We look at three things together: a stable virtual IP and DNS name, the set of backends chosen by the selector, and the three exposure methods ClusterIP / NodePort / LoadBalancer.

By the end of this chapter you’ll have your first manifest that puts a stable entry point in front of Pods. You’ll see three patterns side by side: Pods calling each other by name inside the cluster, direct access from outside via a node port, and an external LB attaching automatically in a cloud environment. The difference between them is often just one line.

The limits of a Pod IP: why a Service is needed

If you’ve followed Chapter 4 to the end, you have this picture in your head: 3 nginx Pods labeled app: web are up, each with a cluster-internal IP such as 10.244.0.5, 10.244.0.6, or 10.244.0.7. From here you want to do one more thing: send an HTTP request to those 3 Pods from another Pod in the same cluster, or open the page once from your laptop browser.

But once you actually try, four problems hit at once.

Pod IPs are ephemeral. When a Pod is recreated it gets a new IP. The 10.244.0.5 you noted yesterday may not exist today, so hard-coding a Pod IP in client code is a dead end from the start.
There’s no load balancing across the 3 Pods. If you pick one Pod IP and call it, only that Pod works while the other two idle. Something has to spread traffic evenly across N Pods.
There’s no service discovery. A client Pod has no obvious place to ask “what is the current IP of the web service?” each time. You need a way to call by name, not by IP.
There’s no entrance for external traffic. Cluster-internal IPs aren’t reachable from a laptop browser. A separate entrance has to be set up for outside traffic to reach a Pod inside.

The abstraction that solves these four at once is the Service. Write one manifest and K8s gives it a stable virtual IP, uses that IP as a load balancer for the Pods selected by the selector, and automatically creates a DNS record so other Pods in the same cluster can call it by name.

Service: stable IP + selector + DNS

A single Service manifest produces three things.

A stable virtual IP (ClusterIP): an IP that doesn’t change for as long as the Service exists. It stays the same no matter how many Pods die and come back.
A Pod group selected by labels: Pods whose labels match spec.selector become that Service’s backends. A newly started Pod joins automatically if its labels match, and it is removed automatically when it dies.
A DNS name: an FQDN of the form <svc>.<ns>.svc.cluster.local is created automatically. Within the same namespace you can call it by just the short name <svc>.

It helps to keep the picture in your head like this.

one Service and the Pods behind it

   ┌──────────────────────────────┐
   │   Service: web               │  selector: app=web
   │   ClusterIP: 10.96.x.x       │  DNS: web.default.svc.cluster.local
   └──────────────┬───────────────┘
                  │ distributes traffic
       ┌──────────┼──────────┐
       ▼          ▼          ▼
   ┌────────┐ ┌────────┐ ┌────────┐
   │ Pod-1  │ │ Pod-2  │ │ Pod-3  │  app=web
   │.0.5    │ │.0.6    │ │.0.7    │  (Pod IPs are temporary)
   └────────┘ └────────┘ └────────┘

The key point in the picture above: the client only needs to know the Service IP or name in the middle, and K8s keeps track of the Pods below as they die and come back. The IP the Service holds is stable, and the Pod IPs behind it are temporary. Keeping the two separate is what makes zero-downtime operation possible.

Endpoints / EndpointSlice: the result of the selector

K8s collects the IPs and ports of the Pods matched by the Service’s selector into a separate object. That object is Endpoints (or EndpointSlice, recommended from 1.21+). You rarely create it by hand; when you create a Service, K8s fills it in automatically.

see the Service's backend list

kubectl get endpoints web

example output

NAME   ENDPOINTS                                     AGE
web    10.244.0.5:80,10.244.0.6:80,10.244.0.7:80     30s

The ENDPOINTS column lists the Pod IPs directly. When one Pod dies it soon disappears from this list, and a newly started Pod joins automatically if it matches the labels.
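As a rough mental model of what gets computed here, the sketch below filters a list of Pods by a selector and emits ip:port strings. The Pod class, the select_endpoints function, and the ready flag are hypothetical names for illustration, not Kubernetes API types; the real controller works on API objects and is more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a Pod: only the fields the selector logic needs."""
    name: str
    ip: str
    labels: dict = field(default_factory=dict)
    ready: bool = True


def select_endpoints(selector: dict, pods: list, target_port: int) -> list:
    """Return "ip:port" for every ready Pod whose labels contain all selector pairs."""
    return [
        f"{pod.ip}:{target_port}"
        for pod in pods
        if pod.ready and all(pod.labels.get(k) == v for k, v in selector.items())
    ]


pods = [
    Pod("web-1", "10.244.0.5", {"app": "web"}),
    Pod("web-2", "10.244.0.6", {"app": "web"}),
    Pod("web-3", "10.244.0.7", {"app": "web"}),
    Pod("db-1", "10.244.0.9", {"app": "db"}),  # different label, never selected
]

endpoints = select_endpoints({"app": "web"}, pods, target_port=80)
print(",".join(endpoints))
```

Calling select_endpoints with a misspelled selector such as {"app": "webb"} returns an empty list, which is the situation the empty ENDPOINTS column reflects.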

From 1.21+, EndpointSlice is recommended. It was introduced to solve the problem of a single Endpoints object growing too large as a Service’s backends increase. For day-to-day use the difference is small, and you can view both with kubectl get.

EndpointSlice too

kubectl get endpointslices -l kubernetes.io/service-name=web

example output

NAME         ADDRESSTYPE   PORTS   ENDPOINTS                          AGE
web-abc12    IPv4          80      10.244.0.5,10.244.0.6,10.244.0.7   30s

This object is the starting point for Service debugging. When the symptom is “traffic doesn’t seem to reach the Service,” look here first.

first, whether it's empty

kubectl get endpoints web

example output when empty

NAME   ENDPOINTS   AGE
web    <none>      1m

If ENDPOINTS is empty, the Service has no ready backend. Most often the selector matches no Pod, for one of two reasons: the selector label has a typo, or there is no matching Pod in that namespace. (Pods that match but are not yet Ready are also left out of this column.) Compare the actual Pod labels from kubectl get pods --show-labels against the selector and the answer usually falls out. The full diagnostic tree is laid out in Chapter 27 kubectl debugging patterns.

ClusterIP: cluster-internal only

Let’s start with the default and most-used type. If you omit the Service’s spec.type, it is ClusterIP. This type gets a virtual IP that is reachable only from inside the cluster.

Assuming the app: web Deployment from Chapter 4 is still running, let’s put a Service in front of it. Name the file web-svc.yaml.

web-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80

The skeleton of the manifest is the four fields we saw in Chapter 3: apiVersion / kind / metadata / spec. Note that a Service belongs to the core group, so its apiVersion is v1, not apps/v1 as for a Deployment. The two are easy to mix up.

There are three new parts inside spec.

type — one of ClusterIP / NodePort / LoadBalancer / ExternalName. If you don’t write it, it’s ClusterIP.
selector — decides which labeled Pods to grab as backends. Above we set it to app: web. The key is to keep it matching the Deployment template labels from Chapter 4.
ports — a list of port mappings. One Service can expose several ports at once, or you can write just one line as above.

port vs targetPort

Here are the two fields under ports, one line each.

port is the port the Service listens on, where the client connects. With the manifest above, clients come in on web:80.
targetPort is the port the backend Pod’s container listens on. The nginx container listens on 80, so it’s 80.

Because the two numbers are the same here they are easy to confuse, but there is a reason they are separate fields. For example, if the container listens on 8080 and you want the Service on the standard port 80, write port: 80, targetPort: 8080. This separation lets a Service double as a lightweight port-mapping layer.
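The mapping can be pictured with a small model. This is a sketch only: ServicePort and rewrite_destination are hypothetical names, not Kubernetes types, and the one behavior assumed beyond the text is that targetPort falls back to the value of port when it is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServicePort:
    """Hypothetical model of one entry under spec.ports."""
    port: int                          # port the Service listens on
    target_port: Optional[int] = None  # port the container listens on

    def backend_port(self) -> int:
        # Assumption: an omitted targetPort defaults to the same value as port.
        return self.port if self.target_port is None else self.target_port


def rewrite_destination(service_port: ServicePort, pod_ip: str) -> str:
    """Return the Pod-side address a request to the Service port ends up at."""
    return f"{pod_ip}:{service_port.backend_port()}"


same = ServicePort(port=80, target_port=80)      # as in web-svc.yaml
split = ServicePort(port=80, target_port=8080)   # Service on 80, container on 8080
omitted = ServicePort(port=80)                   # targetPort left out

for sp in (same, split, omitted):
    print(f"web:{sp.port} -> {rewrite_destination(sp, '10.244.0.5')}")
```

The client-facing side stays web:80 in all three cases; only the Pod-side port differs.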

Applying and checking the result

Apply the manifest to the cluster.

create the Service

kubectl apply -f web-svc.yaml

example output

service/web created

list Services

kubectl get svc

example output

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   2d
web          ClusterIP   10.96.142.31    <none>        80/TCP    10s

The column names are NAME / TYPE / CLUSTER-IP / EXTERNAL-IP / PORT(S) / AGE, and you will see them often through the rest of the book. The kubernetes line is a Service the cluster keeps for its own API server, so you can ignore it. The new one is the web line: CLUSTER-IP 10.96.142.31 was assigned, and EXTERNAL-IP is <none>, meaning it is reachable only from inside the cluster.

(The 10.96.0.0/12 range is kubeadm’s default service CIDR, and it can differ per environment. minikube and kind use similar ranges, and managed offerings such as EKS and GKE have their own defaults.)

Calling it from inside the cluster

The core check for ClusterIP is whether another Pod can call this Service. Let’s start a temporary debug Pod and call the Service with curl from inside it.

bring up a temp curl Pod and go in

kubectl run tmp --rm -it --image=curlimages/curl -- sh

--rm deletes the Pod automatically on exit, and -it means interactive + TTY. Once inside, call the Service three ways.

inside the temp Pod

/ $ curl -s http://web | head -1
<!DOCTYPE html>
/ $ curl -s http://web.default.svc.cluster.local | head -1
<!DOCTYPE html>
/ $ curl -s http://10.96.142.31 | head -1
<!DOCTYPE html>

All three paths point to the same place.

Short name web: within the same namespace (default) the Service name alone is enough. This is the most-used form.
FQDN web.default.svc.cluster.local: the full name, used when calling a Service in another namespace or when you want to remove ambiguity.
ClusterIP 10.96.142.31: you can call the virtual IP directly, but you will rarely remember it. Calling by DNS name is the proper way.

Running the same command several times returns the same nginx welcome page each time, but behind the scenes K8s picks one of the 3 backend Pods per connection and routes the request there. Load balancing is the default behavior, with no extra configuration. To confirm which Pod actually responded, check the nginx access logs: over enough requests they spread roughly evenly across the three Pods.

When you leave the temp Pod with exit, --rm cleans it up automatically. In operations, cluster-internal communication almost always takes this ClusterIP form: backend ↔ DB, backend ↔ Redis, and calls between microservices all go through ClusterIP.

NodePort: expose on a specific port of a node IP

We said ClusterIP is reachable only inside the cluster. The simplest way to make it reachable from outside is NodePort. It opens the same port (default range 30000 ~ 32767) on every node of the cluster and forwards traffic on that port to the same Service.

The manifest needs only two changes from the ClusterIP version.

web-svc.yaml — NodePort version

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080

We changed it to type: NodePort and added nodePort: 30080 under ports. If you don’t write nodePort, K8s auto-picks one from the 30000 ~ 32767 range. When you write it directly, it must be a value within that range.
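The range rule amounts to a simple bounds check. The sketch below is illustrative only: check_node_port is a hypothetical helper, and the two bounds are the default range quoted above, which may be configured differently on a given cluster.

```python
NODE_PORT_MIN = 30000  # default lower bound named in the text
NODE_PORT_MAX = 32767  # default upper bound named in the text


def check_node_port(node_port=None):
    """Return a short verdict for a nodePort value, or for leaving it unset."""
    if node_port is None:
        return "unset: the cluster picks a free port in the range"
    if NODE_PORT_MIN <= node_port <= NODE_PORT_MAX:
        return "ok"
    return f"rejected: {node_port} is outside {NODE_PORT_MIN}-{NODE_PORT_MAX}"


for value in (30080, 8080, None):
    print(value, "->", check_node_port(value))
```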

apply again

kubectl apply -f web-svc.yaml

example output

service/web configured

look again

kubectl get svc web

example output

NAME   TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
web    NodePort   10.96.142.31    <none>        80:30080/TCP   5m

Two things changed: TYPE is now NodePort, and PORT(S) is now 80:30080/TCP. The 80 in front is the Service’s port (the one you call inside the cluster), and the 30080 after it is the nodePort opened on each node. Inside the cluster you still reach it at web:80, and from outside the cluster you reach it at <NodeIP>:30080.
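To make the PORT(S) notation concrete, here is a small sketch that splits one entry into its parts. parse_port_entry is a hypothetical helper written for this chapter, not part of kubectl, and it handles a single entry only (the column can hold several, separated by commas).

```python
def parse_port_entry(entry):
    """Split one PORT(S) entry such as "80:30080/TCP" into its parts."""
    ports, protocol = entry.split("/")
    if ":" in ports:
        # "port:nodePort" form, shown for NodePort and LoadBalancer Services
        port, node_port = ports.split(":")
        return {"port": int(port), "nodePort": int(node_port), "protocol": protocol}
    # plain "port" form, shown for ClusterIP Services
    return {"port": int(ports), "nodePort": None, "protocol": protocol}


print(parse_port_entry("80/TCP"))
print(parse_port_entry("80:30080/TCP"))
```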

from outside, via a node IP

curl http://<NodeIP>:30080

Replace <NodeIP> with the externally reachable IP of a node. How you get one differs slightly per local environment.

kind: the node runs inside a Docker container, so it isn’t directly reachable from the host. Expose 30080 to the host with extraPortMappings when creating the cluster, or work around it with kubectl port-forward.
minikube: get the access URL with minikube service web --url.
Docker Desktop k8s: the node is effectively the host itself, so you reach it directly at localhost:30080.

In operations, NodePort is rarely exposed directly to clients. Port numbers in the 30000s are awkward, and external clients would have to track the list of node IPs as nodes are added and removed. Usually a LoadBalancer or Ingress sits on top and uses NodePort internally. On its own, NodePort is useful for quickly checking external access in local development or for opening a port briefly while debugging.

LoadBalancer: integration with a cloud LB

The most common way to expose a Service externally in operations is LoadBalancer. The single line type: LoadBalancer makes K8s ask the cloud provider (AWS ELB, GCP LB, Azure LB, etc.) to create an external LB automatically. The external address of the created LB then fills the Service’s EXTERNAL-IP column.

web-svc.yaml — LoadBalancer version

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80

apply again

kubectl apply -f web-svc.yaml

In a cloud environment

Applying the manifest above on a managed cluster such as EKS, GKE, or AKS usually creates an external LB within 1 to 2 minutes.

while it's being created

kubectl get svc web

example output — the first minute

NAME   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
web    LoadBalancer   10.96.142.31    <pending>     80:31523/TCP   20s

example output — after the LB attaches

NAME   TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
web    LoadBalancer   10.96.142.31    a1b2c3d4.elb..   80:31523/TCP   2m

EXTERNAL-IP changes from <pending> to a real IP or DNS name, and that address is the external entry point. The form differs slightly per environment: on AWS it is an ELB DNS name, on GCP an IP address. Note that a NodePort, 31523, also appears in PORT(S). A LoadBalancer Service allocates a NodePort automatically, and the cloud LB sends traffic to that NodePort. In that sense LoadBalancer is layered on top of NodePort.
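One way to see the layering is to list the addresses each type offers. The sketch below is a simplified model under the default behavior described here; entry_points is a hypothetical function, and 203.0.113.10 in the usage note is a documentation-range placeholder, not an address from the text.

```python
def entry_points(svc_type, cluster_ip, port, node_port=None, external=None):
    """List the addresses a Service of the given type can be reached on."""
    points = [f"{cluster_ip}:{port}"]            # all three types get a ClusterIP
    if svc_type in ("NodePort", "LoadBalancer"):
        points.append(f"<NodeIP>:{node_port}")   # opened on every node
    if svc_type == "LoadBalancer":
        # Until something provisions the LB, the external address is pending.
        points.append(f"{external or '<pending>'}:{port}")
    return points


for svc_type, node_port in [("ClusterIP", None), ("NodePort", 30080), ("LoadBalancer", 31523)]:
    print(svc_type, entry_points(svc_type, "10.96.142.31", 80, node_port))
```

Passing external="203.0.113.10" for the LoadBalancer case replaces the pending entry with that address, which mirrors the EXTERNAL-IP column filling in.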

We cover this shape in Part 4 EKS in Production with ALB (Chapter 22 app deployment skeleton).

In local and on-premises environments

On kind, standalone minikube, or an ordinary bare-metal cluster without a cloud controller, applying the manifest above leaves EXTERNAL-IP stuck at <pending> indefinitely. Nothing is there to create an external LB.

example output locally

NAME   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
web    LoadBalancer   10.96.142.31    <pending>     80:31523/TCP   5m

Tools that fill this gap include MetalLB (for bare metal) and cloud-provider-kind (kind only). Install one and it acts like a cloud controller, filling in EXTERNAL-IP. We only name them here; installation details are outside this chapter’s scope.

The point is simple: in operations the external entry point is almost always a LoadBalancer or an Ingress on top of it. Ingress is a higher-level abstraction that routes to several Services by host and path behind one LoadBalancer, and it is covered in Chapter 10 Ingress and Ingress Controller. This chapter stops at LoadBalancer.

Service types in one table

The table below puts the three types covered so far next to two others you will commonly meet.

Type	External exposure	Main use
`ClusterIP` (default)	none (cluster-internal only)	backend ↔ DB, communication between microservices
`NodePort`	`<NodeIP>:<30000 ~ 32767>`	local development, external access for debugging, the inner implementation of an LB
`LoadBalancer`	the cloud LB’s external IP / DNS	the production external entry point. Needs a cloud, MetalLB, etc.
`ExternalName`	none (DNS CNAME only)	aliasing a cluster-internal name to an external domain
Headless (`clusterIP: None`)	none (no virtual IP)	when individual Pod IPs are needed, like with StatefulSet

Here are the last two rows, one at a time.

ExternalName. Writing type: ExternalName and externalName: db.example.com in the manifest makes the cluster’s internal DNS answer a query for <svc>.<ns>.svc.cluster.local with a CNAME to the external domain. It is a special case with no selector and no backend Pods, used when you want to call an external system by a cluster-internal name.
Headless Service. Writing spec.clusterIP: None allocates no virtual IP, and a DNS lookup returns the backend Pod IPs directly. It is meant for cases where the client must reach each Pod individually, as with the StatefulSet in Chapter 8. It is rarely used for an ordinary web service.
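The difference between these two and an ordinary Service shows up most clearly in what a DNS lookup returns. The sketch below models that with plain dictionaries; dns_answer and the podIPs key are hypothetical, chosen for illustration, and the real answers come from the cluster DNS server.

```python
def dns_answer(service):
    """Return (record type, value) pairs a lookup of the Service name would yield."""
    if service.get("type") == "ExternalName":
        # No virtual IP and no backends: just an alias to the external domain.
        return [("CNAME", service["externalName"])]
    if service.get("clusterIP") == "None":
        # Headless: one A record per backend Pod instead of a single virtual IP.
        return [("A", ip) for ip in service["podIPs"]]
    return [("A", service["clusterIP"])]


regular = {"type": "ClusterIP", "clusterIP": "10.96.142.31"}
headless = {"type": "ClusterIP", "clusterIP": "None", "podIPs": ["10.244.0.5", "10.244.0.6"]}
external = {"type": "ExternalName", "externalName": "db.example.com"}

for svc in (regular, headless, external):
    print(dns_answer(svc))
```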

kube-proxy: so who actually moves the traffic

If you’ve followed this far, one thing may nag at you: the Service’s ClusterIP 10.96.142.31 is not attached to any real network interface on any node. In the default iptables mode, running ip addr on any node shows no such IP. Yet a packet sent to that IP from inside a Pod arrives somewhere. Who moves it?

The answer is a system component called kube-proxy running on each node. It’s a daemon that already appeared once as a worker-node component in Chapter 1.

kube-proxy's job

   Pod ─▶ 10.96.142.31:80 (virtual IP)
              │
              ▼  DNAT via iptables/IPVS rules
              │
   one of the three Pod IPs ─▶ 10.244.0.5:80
                              10.244.0.6:80
                              10.244.0.7:80

kube-proxy watches Endpoints / EndpointSlice and programs the node’s iptables (or IPVS) rules automatically. In effect the rule says “a packet headed for 10.96.142.31:80 is DNAT’d to one of 10.244.0.5:80, 10.244.0.6:80, 10.244.0.7:80.” A packet that a Pod sends to the Service IP is caught by this rule before it leaves the node, and its destination is rewritten to an actual Pod IP.
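How “one of” can be expressed with ordered rules is worth a short sketch. In iptables mode the choice is commonly implemented as a chain of rules tried in order, each matching with a fixed random probability: 1/n for the first, 1/(n-1) for the second, and so on, with the last always matching. The code below is a model of that arithmetic under this assumption, not kube-proxy’s code, and both function names are hypothetical.

```python
from fractions import Fraction


def rule_probabilities(n):
    """Per-rule match probabilities for n backends: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, n - i) for i in range(n)]


def effective_share(n):
    """Overall chance each backend is chosen when the rules are tried in order."""
    shares = []
    not_yet_matched = Fraction(1)  # probability that no earlier rule matched
    for p in rule_probabilities(n):
        shares.append(not_yet_matched * p)
        not_yet_matched *= 1 - p
    return shares


backends = ["10.244.0.5:80", "10.244.0.6:80", "10.244.0.7:80"]
for backend, rule_p, share in zip(backends, rule_probabilities(3), effective_share(3)):
    print(f"{backend}  rule probability {rule_p}  overall share {share}")
```

Although the per-rule probabilities differ, the overall share works out the same for every backend, which is why traffic spreads roughly evenly over many connections.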

So a Service is not an LB sitting on any one node but a virtual LB distributed across all nodes. The same rules are installed on every node, so calling the same ClusterIP works equally well no matter which node a Pod runs on. kube-proxy’s mode is usually iptables (the default) or ipvs; deeper behavior and the eBPF-based alternatives (Cilium, etc.) are covered in Chapter 15 CNI in depth.

DNS: CoreDNS and the service name

Let’s pin down how a short name like web resolves to a ClusterIP. The cluster’s kube-system namespace runs a DNS server called CoreDNS (usually as two Pods). CoreDNS automatically serves a DNS record for every Service; for an ordinary Service that is an A record pointing at the ClusterIP.

The default domain is cluster.local, and the FQDN is <svc>.<ns>.svc.cluster.local. Within the same namespace, writing just the short name <svc> resolves because the search domain is appended on its own.

check DNS inside the temp Pod

nslookup web
# Server:    10.96.0.10
# Address:   10.96.0.10#53
#
# Name:      web.default.svc.cluster.local
# Address:   10.96.142.31

The key point is that the answer is the same ClusterIP we saw earlier. K8s fills in a Pod’s /etc/resolv.conf automatically: nameserver holds CoreDNS’s ClusterIP (a value like 10.96.0.10), and search holds <ns>.svc.cluster.local svc.cluster.local cluster.local. That is how a short name expands to the full name.
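The expansion itself can be sketched in a few lines. This is a simplified model of a glibc-style resolver, not CoreDNS code: candidate_names is a hypothetical function, and the ndots value of 5 is an assumption about the options line that usually accompanies the search list in a Pod (it is not shown in the text above). Other resolver libraries differ in the details.

```python
def candidate_names(name, search, ndots=5):
    """Names a resolver would try, in order, for `name` given a search list."""
    if name.endswith("."):
        return [name]                      # trailing dot: already fully qualified
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded           # enough dots: try the name as-is first
    return expanded + [name]               # few dots: walk the search list first


# Search list for a Pod in the default namespace, as described above.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("web", search):
    print(candidate)
```

The first candidate for web is the Service’s full name, so the lookup succeeds on the first try. A Service in another namespace needs at least <svc>.<ns>, which the second search entry completes.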

The default domain cluster.local can be changed (it is an option at cluster install time). Since almost every environment keeps the default, it is usually safe to assume cluster.local in manifests and code.

Cleanup and teardown

Delete both the Service we made in this chapter and the Deployment left over from Chapter 4.

clean up the Service

kubectl delete -f web-svc.yaml

example output

service "web" deleted

and the Deployment (if any)

kubectl delete deploy web

example output

deployment.apps "web" deleted

Confirm nothing is left with kubectl get svc,deploy,pods and you’re back at the starting point. It’s normal for the kubernetes Service line to remain: the cluster keeps that Service for itself, and it isn’t ours to delete.

Exercises

As in the body, change the type in web-svc.yaml in the order ClusterIP → NodePort → LoadBalancer, running kubectl apply each time. Make a table of how the TYPE / CLUSTER-IP / EXTERNAL-IP / PORT(S) columns of kubectl get svc web change at each step. Then note in a paragraph how the environments (local vs managed cloud) diverge at the LoadBalancer step, depending on whether EXTERNAL-IP stays at <pending> or changes to a real address.
Deliberately change the Service’s spec.selector label by one character (e.g., app: web → app: webb). Record how the ENDPOINTS column of kubectl get endpoints web changes, and what error curl http://web returns inside kubectl run tmp --rm -it --image=curlimages/curl -- sh. Explain how the debugging starting point from the Endpoints / EndpointSlice section applies.
Inside the temp curl Pod, run these three in turn: nslookup web, nslookup web.default.svc.cluster.local, and nslookup kubernetes.default.svc.cluster.local. Make a table of where each answer came from (CoreDNS, a ClusterIP, the kubernetes system Service), and note in a paragraph how the search list in the Pod’s /etc/resolv.conf expands a short name.

In one line: a Service is the abstraction that overcomes the limits of temporary Pod IPs by providing, as one set, a stable ClusterIP, a Pod group selected by labels, and an automatic CoreDNS A record. External exposure comes in two forms, NodePort (a node port) and LoadBalancer (a cloud LB), and the actual traffic is DNAT’d to Pod IPs by each node’s kube-proxy using iptables or IPVS.

Next chapter

Even after all this, one thing still sits awkwardly in the manifest: values such as the image tag, port, and domain are written directly into it. The next topic is pulling out of the manifest body the values that should differ per environment (dev / staging / prod), as well as values such as passwords that shouldn’t sit in a manifest in plaintext.

In Chapter 6 ConfigMap and Secret we cover gathering config values in a ConfigMap and injecting them into a Pod as environment variables or volumes, what distinguishes a Secret from a ConfigMap (including the key point that base64 is not encryption), and the flow of moving a set of config values out of this chapter’s web Deployment into an external object. Production secret operations (sealed-secrets, external-secrets, IRSA) are covered in Chapter 29 secret operations.