Chapter 13

ALB / NLB and ACM (HTTPS)

This chapter covers how AWS’s managed load balancers (ALB / NLB / GWLB) differ in role, how Listener / Target Group / Health Check fit together, and the operational procedure for issuing a certificate with ACM and adding HTTPS.

The domain you set up in Chapter 12 Route 53 almost always points at a load balancer. AWS’s managed load balancers are collectively called ELB (Elastic Load Balancing), which comes in three kinds: ALB / NLB / GWLB.

This chapter starts from the differences among the three LBs, then lays out the ALB’s Listener / Target Group / Health Check flow and the process of obtaining a certificate from ACM to add HTTPS. The SG patterns covered here extend Chapter 9 EC2 Operations; the pattern of using an ALB as an Origin and ACM’s region rule carry over into Chapter 14 CloudFront; and the Target Group’s ip type reappears with automatic registration in Chapter 15 ECS and Fargate.

The problem a load balancer solves #

If you point a domain directly at a single EC2 instance, the following things break.

Problem                                   How the LB solves it
If the EC2 dies, the site goes down       Health checks remove the dead instance and route to live ones
Traffic grows, one machine isn’t enough   Distributes load across multiple instances
Total outage on an AZ failure             Multi-AZ distribution
Managing HTTPS certificates per instance  The LB terminates TLS once (TLS termination)
Canary / Blue-Green deployment            Distributes by ratio with Listener Rules

In operations it’s almost always the Internet → ALB / NLB → EC2 / ECS / Lambda pattern.

ALB / NLB / GWLB — comparing the three #

                 ALB (Application LB)           NLB (Network LB)              GWLB (Gateway LB)
OSI layer        L7 (HTTP / HTTPS)              L4 (TCP / UDP / TLS)          L3/L4 (IP)
Routing method   path / host / header / method  port only                     packet as-is
Throughput       Good                           Very fast (millions of RPS)   Fast where appropriate
WebSocket        Supported                      Supported                     -
HTTP/2           Supported                      Supported (TLS)               -
Static IP        None (DNS)                     Yes (EIP per AZ)              -
WAF integration  Supported                      No                            -
Cognito auth     Supported                      No                            -
Lambda target    Supported                      No                            -
Use              Web / API                      Games / IoT / TCP / gRPC      Security appliances (Firewall)

A decision guide #

LB decision tree
HTTP(S) ?
├── YES → ALB
│   └── very high RPS (hundreds of thousands+)? → start with ALB, consider NLB once ALB reaches its limit
└── NO →
    TCP/UDP ?
    ├── YES → NLB
    └── pass through an external security appliance → GWLB

General web workloads are almost always ALB. NLB is used for games / IoT / very high throughput / when you need a static IP.
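The decision tree above can be written down as a small function. This is only a sketch: the function name and its parameters are hypothetical, and the static-IP shortcut is taken from the sentence above rather than from any AWS API.

```python
def choose_load_balancer(protocol, needs_static_ip=False, via_security_appliance=False):
    """Return "ALB", "NLB", or "GWLB" by following the decision tree above."""
    proto = protocol.upper()
    # Traffic that must pass through an external security appliance goes to GWLB.
    if via_security_appliance:
        return "GWLB"
    # A static IP (EIP per AZ) is only available on NLB.
    if needs_static_ip:
        return "NLB"
    # HTTP(S) workloads are almost always ALB.
    if proto in ("HTTP", "HTTPS"):
        return "ALB"
    # Plain TCP / UDP workloads go to NLB.
    if proto in ("TCP", "UDP", "TLS"):
        return "NLB"
    raise ValueError(f"no rule in the decision tree for protocol {protocol!r}")


print(choose_load_balancer("HTTPS"))                      # general web / API
print(choose_load_balancer("TCP", needs_static_ip=True))  # game server behind an allowlist
```

Real decisions also weigh cost and existing infrastructure, so treat the function as a reading aid for the tree, not as a sizing tool.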

The structure of an ALB #

The structure of an ALB
                    Route 53
                   ┌────────┐
                   │  ALB   │
                   │        │
                   └─┬──┬───┘
                     │  │
                Listener (port 443)
                     │  │
                     ▼  ▼
                Listener Rules
                  (path / host)
                     ├── /api/*  ────▶ Target Group A (api EC2s)
                     ├── /admin/* ────▶ Target Group B (admin)
                     └── default /  ────▶ Target Group C (web)
                                          ├── EC2 #1 (AZ a)
                                          ├── EC2 #2 (AZ b)
                                          └── EC2 #3 (AZ a)

The core components are as follows.

Listener — the receiving port #

A Listener sets which port and protocol the ALB receives traffic on, usually 80 (HTTP) and 443 (HTTPS).

Creating an ALB + Listener
aws elbv2 create-load-balancer \
  --name my-alb \
  --subnets subnet-pubA subnet-pubB \
  --security-groups sg-alb \
  --scheme internet-facing \
  --type application

aws elbv2 create-listener \
  --load-balancer-arn <alb-arn> \
  --protocol HTTPS \
  --port 443 \
  --certificates CertificateArn=<acm-arn> \
  --default-actions Type=forward,TargetGroupArn=<tg-arn>

Target Group — the targets to send to #

A Target Group is the set of instances / IPs / Lambdas that the Listener sends traffic to. Each group has its own port and protocol.

Creating a Target Group
aws elbv2 create-target-group \
  --name my-app-tg \
  --protocol HTTP \
  --port 8080 \
  --vpc-id vpc-... \
  --target-type instance \
  --health-check-protocol HTTP \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

The meaning of target-type is as follows.

Type      Role
instance  An EC2 instance ID. Easy to pair with SG references
ip        An arbitrary IP (inside the VPC). Used by ECS / Fargate’s automatic registration
lambda    A Lambda function. The ALB → Lambda pattern

Listener Rule — the routing rule #

A Listener Rule sends requests arriving on the same Listener to different Target Groups based on path / host / header.

The shape of Listener Rules
Priority  Condition                      Action
10        host = api.example.com         forward → tg-api
20        path = /admin/*                forward → tg-admin
30        path = /static/*               redirect → cloudfront.example.com
default   *                              forward → tg-web

Rules are evaluated in priority order (smallest number first), and evaluation stops at the first match.
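First-match evaluation can be sketched in a few lines of Python. The rule table mirrors the example above; the `route` function and the use of shell-style wildcards via `fnmatchcase` are assumptions made for illustration, not how the ALB is implemented internally.

```python
from fnmatch import fnmatchcase

# (priority, field, pattern, action), mirroring the example rule table
RULES = [
    (20, "path", "/admin/*", "forward:tg-admin"),
    (10, "host", "api.example.com", "forward:tg-api"),
    (30, "path", "/static/*", "redirect:cloudfront.example.com"),
]
DEFAULT_ACTION = "forward:tg-web"


def route(host, path):
    """Return the action of the first matching rule, lowest priority number first."""
    for _priority, field, pattern, action in sorted(RULES):
        value = host if field == "host" else path
        if fnmatchcase(value, pattern):
            return action  # evaluation stops at the first match
    return DEFAULT_ACTION


# Both the host rule (10) and the /admin/* rule (20) match; priority 10 wins.
print(route("api.example.com", "/admin/users"))
print(route("www.example.com", "/admin/users"))
print(route("www.example.com", "/"))
```

The first call shows why priorities matter: a request can satisfy several conditions, and only the lowest-numbered matching rule decides where it goes.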

The Listener Rule actions are as follows.

  • forward — send to a Target Group (distribution across several places by weight is also possible).
  • redirect — send to a different URL (the standard for an HTTP → HTTPS permanent redirect).
  • fixed-response — give a fixed response (a maintenance page).
  • authenticate-cognito / -oidc — pass through after authenticating the user.

The standard for an HTTP → HTTPS redirect
Listener (port 80)
  default action: redirect → HTTPS://#{host}:443/#{path}?#{query} (301)

Listener (port 443)
  default action: forward → tg-web

This pattern is the operational standard. Send all port 80 traffic permanently to 443.
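As a rough sketch, the redirect on the port 80 Listener can be expressed as the action structure that the ELBv2 API accepts. The helper name `https_redirect_action` is hypothetical; the dictionary keys follow the `RedirectConfig` shape of the `elbv2` API, and the code only builds the structure without calling AWS.

```python
def https_redirect_action():
    """Build a default action that permanently redirects HTTP to HTTPS."""
    return {
        "Type": "redirect",
        "RedirectConfig": {
            "Protocol": "HTTPS",
            "Port": "443",
            # The #{...} placeholders keep the original host, path, and query.
            "Host": "#{host}",
            "Path": "/#{path}",
            "Query": "#{query}",
            "StatusCode": "HTTP_301",  # permanent redirect
        },
    }


action = https_redirect_action()
print(action["RedirectConfig"]["StatusCode"])
```

Assuming boto3 is available, a dictionary like this would be passed in the `DefaultActions` list when creating the port 80 Listener.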

Health Check — only live targets #

The Target Group’s Health Check automatically removes dead instances and puts them back when they revive.

The flow of a Health Check
ALB ─ HTTP GET /health ──▶ EC2:8080
                       response 200 ──▶ healthy (2 in a row)
                       response 5xx ──▶ unhealthy (3 in a row) → excluded from routing

The frequently used options are as follows.

Option                      Role
HealthCheckPath             A light path like /health, /healthz
HealthCheckProtocol         HTTP / HTTPS
HealthCheckIntervalSeconds  30 (usual)
HealthyThresholdCount       2~3 consecutive 200s → healthy
UnhealthyThresholdCount     2~3 consecutive failures → unhealthy
Matcher.HttpCode            200 or 200-299
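The way the two thresholds interact can be modeled as a small state machine. This is a simplified sketch, not the ALB's actual implementation: the class name `TargetHealth` is hypothetical, the thresholds (2 healthy, 3 unhealthy) are taken from the create-target-group example earlier, and a 200-299 matcher is assumed.

```python
class TargetHealth:
    """Tracks consecutive health check results for one target."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=3):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.state = "initial"
        self._streak = 0       # length of the current run of identical results
        self._last_ok = None   # result of the previous check

    def record(self, status_code):
        ok = 200 <= status_code <= 299  # assumed matcher: 200-299
        if ok == self._last_ok:
            self._streak += 1
        else:
            self._streak = 1
            self._last_ok = ok
        # The state only flips after enough consecutive results of one kind.
        if ok and self._streak >= self.healthy_threshold:
            self.state = "healthy"
        elif not ok and self._streak >= self.unhealthy_threshold:
            self.state = "unhealthy"
        return self.state


target = TargetHealth()
for code in (200, 200, 500, 500, 500, 200, 200):
    print(code, target.record(code))
```

A single failed check does not remove a target, and a single success does not bring it back; with a 30-second interval, the thresholds above mean roughly 90 seconds to mark a target unhealthy and 60 seconds to restore it.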

Designing the Health Check path #

A good /health response is light.

Example of a light health endpoint (FastAPI)
from fastapi import FastAPI
app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

A deep health check that also verifies the DB goes on a separate path.

deep health
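from fastapi import Depends
from sqlalchemy import text
from sqlalchemy.orm import Session

@app.get("/health/deep")
def deep_health(db: Session = Depends(get_db)):  # get_db: the app's own session dependency
    db.execute(text("SELECT 1"))  # SQLAlchemy 2.x needs text() for raw SQL
    return {"status": "ok", "db": "ok"}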

Keep /health always light, and use /health/deep only for monitoring or debugging. If the ALB polls the deep health check, every check on every target runs a DB query, which puts needless load on the DB.

Sticky Session — to the same target #

Stickiness is an option that always sends a particular user’s requests to the same instance.

Turning on Stickiness
aws elbv2 modify-target-group-attributes \
  --target-group-arn <tg-arn> \
  --attributes \
    Key=stickiness.enabled,Value=true \
    Key=stickiness.type,Value=lb_cookie \
    Key=stickiness.lb_cookie.duration_seconds,Value=86400

You use it for old apps that keep sessions in memory (something you should rarely build, but it can be a temporary measure during a migration) or for long-lived connections like WebSocket.

In operations the standard is to move the session outside (Redis / DB) and keep it stateless. Use Stickiness only in exceptional cases.

The use of NLB #

You use NLB for jobs that ALB can’t solve.

NLB’s strengths #

  • Static IP (EIP per AZ) — useful for firewall allowlists.
  • Millions of RPS handling.
  • TLS termination is also possible — NLB terminates TLS and sends plaintext to the backend.
  • The front end for PrivateLink — exposes a service to another VPC.

NLB’s weaknesses #

  • L4 only — no path / host / header routing.
  • AWS WAF cannot be attached directly.
  • Each Listener forwards to a single Target Group.

NLB + TLS Listener
aws elbv2 create-load-balancer \
  --name my-nlb --type network \
  --subnets subnet-pubA subnet-pubB

aws elbv2 create-listener \
  --load-balancer-arn <nlb-arn> \
  --protocol TLS --port 443 \
  --certificates CertificateArn=<acm-arn> \
  --default-actions Type=forward,TargetGroupArn=<tg-arn>

ACM — certificate issuance #

ACM (AWS Certificate Manager) issues public SSL/TLS certificates for free and renews them automatically, for use with ALB / NLB / CloudFront / API Gateway inside AWS.

Requesting a certificate #

Requesting an ACM certificate
aws acm request-certificate \
  --domain-name example.com \
  --subject-alternative-names "*.example.com" "api.example.com" \
  --validation-method DNS \
  --region ap-northeast-2

There are two validation methods.

  • DNS validation (recommended) — auto-validates with a CNAME in Route 53, and auto-renews.
  • Email validation (the old way) — sends mail to admin@example.com, etc., and is manual every year.

You almost always use DNS validation. If you use Route 53 alongside it, the console handles it automatically with a “Create record in Route 53” button.
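Outside the console, the same step amounts to copying the CNAME that ACM hands back into the hosted zone. The sketch below only builds the Route 53 change batch; the helper name and the sample record values are hypothetical, and the input dictionary is assumed to have the `Name` / `Type` / `Value` shape of the `ResourceRecord` that ACM reports for a DNS-validated domain.

```python
def validation_change_batch(resource_record, ttl=300):
    """Turn ACM's validation record into a Route 53 UPSERT change batch."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or overwrite a stale one
                "ResourceRecordSet": {
                    "Name": resource_record["Name"],
                    "Type": resource_record["Type"],
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": resource_record["Value"]}],
                },
            }
        ]
    }


# Hypothetical values; the real ones come from the certificate's validation details.
record = {
    "Name": "_abc123.example.com.",
    "Type": "CNAME",
    "Value": "_def456.acm-validations.aws.",
}
batch = validation_change_batch(record)
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

The batch must be applied to the hosted zone that actually serves the domain; writing it to a different zone leaves the certificate pending.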

The certificate’s region-match rule #

ACM certificates are per-region. If the ALB is in Seoul, you issue the certificate in Seoul too. The exception is CloudFront, whose certificate must always be in us-east-1 (Chapter 14).

Region mapping
ALB (Seoul)        → ACM certificate (ap-northeast-2)
NLB (Tokyo)        → ACM certificate (ap-northeast-1)
CloudFront         → ACM certificate (us-east-1) ← always
API Gateway (Regional)       → ACM certificate (that region)
API Gateway (Edge-optimized) → ACM certificate (us-east-1)
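The mapping above reduces to one rule with two exceptions. A minimal sketch, where the function name and the service labels are hypothetical identifiers chosen for this example:

```python
# Services that read their certificate from us-east-1 regardless of where you work.
US_EAST_1_ONLY = {"cloudfront", "apigateway-edge"}


def acm_region_for(service, resource_region):
    """Return the region in which the ACM certificate has to be issued."""
    if service.lower() in US_EAST_1_ONLY:
        return "us-east-1"
    # Everything else uses a certificate from the resource's own region.
    return resource_region


print(acm_region_for("alb", "ap-northeast-2"))
print(acm_region_for("cloudfront", "ap-northeast-2"))
```

The second call is the case that trips people up: working in Seoul does not change where the CloudFront certificate has to live.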

Automatic renewal #

ACM certificates are valid for 13 months by default and auto-renew starting 60 days before expiry.

  • A certificate created with DNS validation is fully automatic.
  • Email validation is manual, so DNS is recommended.

Automatic renewal fails when the DNS validation CNAME has been deleted, or when the domain’s DNS has moved elsewhere so that validation no longer succeeds even though the ALB is still using the certificate. The ACM console shows an expiry notice automatically, and you can also set a CloudWatch alarm.
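The 60-day window is simple date arithmetic, which is handy when deciding how early an alarm should fire. A small sketch with hypothetical function names and an example expiry date:

```python
from datetime import date, timedelta


def renewal_starts_on(not_after, lead_days=60):
    """Date on which automatic renewal attempts begin for a certificate."""
    return not_after - timedelta(days=lead_days)


def in_renewal_window(today, not_after, lead_days=60):
    """True while renewal should be in progress but the certificate is still valid."""
    return renewal_starts_on(not_after, lead_days) <= today < not_after


expiry = date(2025, 3, 1)  # example expiry date
print(renewal_starts_on(expiry))
print(in_renewal_window(date(2025, 2, 1), expiry))
```

If a certificate is well inside this window and still has not been renewed, that is the signal to check the validation CNAME.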

The one-line procedure to add HTTPS #

Assume you have a domain and an ALB.

The procedure to add HTTPS
1. Request a certificate in ACM (DNS validation)
2. Add the validation CNAME to Route 53 (one console click)
3. Wait for the certificate to be ISSUED (a few minutes)
4. Attach the certificate to the ALB Listener 443
5. Listener 80 → HTTPS redirect
6. The Route 53 domain → ALB Alias

These six steps are the operational standard pattern. There’s also no need to renew the certificate every year.

Security Policy — the TLS version #

The Listener’s Security Policy defines the allowed TLS versions and cipher suites.

Commonly used policies
ELBSecurityPolicy-TLS13-1-2-2021-06   ← recommended (TLS 1.3, 1.2)
ELBSecurityPolicy-TLS-1-2-2017-01     ← TLS 1.2 only
ELBSecurityPolicy-FS-2018-06          ← enforces Forward Secrecy

If you want to cut off old clients (TLS 1.0, 1.1), use TLS13-1-2-2021-06. New policies are released from time to time, so check periodically.

Connection Draining — graceful shutdown #

Connection draining (the deregistration delay) waits so that in-flight requests aren’t cut off when you remove an instance manually or the ASG scales in.

Connection Draining
ALB ─ sends no new requests ─▶ EC2  (deregistration_delay = 300s)
        in-flight requests are processed to completion

The default is 300 seconds. Reducing it to 30 ~ 60 seconds is usually enough. Too short a value cuts off in-flight requests, and too long a value slows down deployments.
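The trade-off can be made concrete with a toy model: given how much longer each in-flight request needs, which ones survive a given delay? The function name and the sample durations are made up for illustration.

```python
def drain(remaining_seconds, deregistration_delay):
    """Split in-flight requests into those that finish and those that get cut off."""
    completed = [r for r in remaining_seconds if r <= deregistration_delay]
    cut_off = [r for r in remaining_seconds if r > deregistration_delay]
    return completed, cut_off


in_flight = [2, 45, 120]  # seconds each request still needs (sample values)
print(drain(in_flight, 60))   # short delay: the 120 s request is cut off
print(drain(in_flight, 300))  # default delay: everything finishes
```

Choosing the delay therefore comes down to the longest request you expect: a value just above it keeps deployments fast without dropping work.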

The LB’s SG and the EC2’s SG #

This is the SG pattern we saw in Chapter 9 EC2 Operations.

The SG of ALB ↔ EC2
ALB SG (sg-alb)
  Inbound:  443 ← 0.0.0.0/0
            80  ← 0.0.0.0/0         ← for the HTTP → HTTPS redirect
  Outbound: all

EC2 SG (sg-app)
  Inbound:  8080 ← sg-alb           ← the ALB SG itself
  Outbound: all
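The key detail in the ALB ↔ EC2 pattern is that the EC2 rule names the ALB's security group instead of an IP range. As a sketch, the rule can be written in the `IpPermissions` shape used by the EC2 security group API; the helper name is hypothetical, `sg-alb` is the placeholder ID from the diagram, and nothing here calls AWS.

```python
def app_ingress_from_alb(alb_sg_id, app_port=8080):
    """Build an inbound rule that admits traffic only from the ALB's SG."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": app_port,
            "ToPort": app_port,
            # Reference the ALB security group itself rather than a CIDR block.
            "UserIdGroupPairs": [{"GroupId": alb_sg_id}],
        }
    ]


permissions = app_ingress_from_alb("sg-alb")
print(permissions[0]["UserIdGroupPairs"])
```

Because the rule has no CIDR entry, the instances stay unreachable from the internet even when they sit in a public subnet.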

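NLB is different. An NLB either has no SG (the older behavior) or has an SG attached when it is created (the newer option). Without an SG on the NLB, the EC2 SG sees the client IP as-is, not the NLB’s IP, so its inbound rules have to allow the client ranges directly. In operations, you sometimes add a NACL or VPC-level control to narrow down which client IPs can reach the NLB.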

Common pitfalls #

  • The ACM certificate not getting ISSUED — In nearly every case, the validation CNAME didn’t go into Route 53, went into the wrong zone (the zone of api.example.com rather than example.com), or a long TTL means the record hasn’t propagated yet. Once the record is correct, wait about 5 minutes.
  • Issuing the CloudFront certificate in Seoul — CloudFront uses only us-east-1. When you receive it in the console, change the top-right to N. Virginia.
  • Not making an HTTP 80 Listener and having only HTTPS — When a user comes in via http://example.com, it times out. The standard is a redirect-to-HTTPS on the 80 Listener.
  • A health check path of / — If / does heavy work (a DB call, etc.), the ALB health check bombards the DB every 30 seconds. Place a light path like /health separately.
  • Confusing the Target Group’s port vs the Listener port — The ALB Listener may be 443 while the Target Group’s EC2 is 8080. The two ports are separate. The Target Group port is the port the EC2 is listening on.
  • 502 Bad Gateway — When the ALB → EC2 response breaks. Common causes are the EC2’s keep-alive timeout being shorter than the ALB idle timeout (60s default), the EC2 cutting off before responding, the EC2 SG not accepting the ALB SG on inbound, or the EC2 not listening on 8080.
  • Load skew from sticky sessions — Requests pile up on a particular instance and only one machine hits 100% CPU. Turn off Stickiness and keep it stateless.
  • Cross-Zone off — With cross-zone load balancing off, traffic is split evenly per AZ, so when one AZ has fewer instances but receives the same share of traffic, its instances carry more load. Turn on cross-zone load balancing (on by default for ALB, where it can only be turned off per Target Group; off by default for NLB for cost reasons).
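
The cross-zone pitfall is easiest to see with numbers. The sketch below computes the share of total traffic that each instance receives, assuming traffic arrives evenly across AZs; the function name and the 2 + 1 instance layout are illustrative.

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return the fraction of total traffic handled by one instance in each AZ."""
    total = sum(instances_per_az.values())
    active_azs = [az for az, count in instances_per_az.items() if count > 0]
    shares = {}
    for az in active_azs:
        if cross_zone:
            # Every instance gets an equal slice, wherever it lives.
            shares[az] = 1 / total
        else:
            # Each AZ gets an equal slice, split among that AZ's instances only.
            shares[az] = (1 / len(active_azs)) / instances_per_az[az]
    return shares


layout = {"a": 2, "b": 1}
print(per_instance_share(layout, cross_zone=False))
print(per_instance_share(layout, cross_zone=True))
```

With cross-zone off, the lone instance in AZ b handles half of all traffic while each instance in AZ a handles a quarter; with it on, all three carry a third.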

Exercises #

  1. Write down the six steps of §“The one-line procedure to add HTTPS” without looking. Mark which steps are the Chapter 12 Route 53 work (adding the validation CNAME, the domain Alias), and distinguish the ACM work from the ALB work.
  2. Looking at the ALB vs NLB comparison table, answer what you’d choose for each of the following three workloads, based on §“A decision guide”. (a) a general REST API, (b) a TCP game server that needs a static IP for a firewall allowlist, (c) a website that needs WAF attached.
  3. Assume a situation where you issued a CloudFront certificate in Seoul and it isn’t getting ISSUED, and write down the cause and fix based on §“The certificate’s region-match rule”. This rule is confirmed again in Chapter 14 CloudFront.

In short: ELB splits into ALB / NLB / GWLB, and general web traffic is almost always an ALB use case. The ALB flow is Listener (receiving port) → Listener Rule (path/host) → Target Group (targets to send to) → Health Check, and making port 80’s default action an HTTPS redirect is the standard. ACM auto-renews free SSL certificates via DNS validation, and the ALB is in the same region while CloudFront is always us-east-1.

Next chapter #

We now have the roles up to the ALB in hand. Next, Chapter 14 CloudFront moves on to CloudFront, which places a cache close to the user. We’ll lay out the flow of Origin / Behavior / Cache Policy, the S3 + CloudFront static hosting pattern, how to safely shield S3 with OAC, and invalidation.
