AWS Certified Solutions Architect - Associate (SAA-C03) #6 Domain 2-1 Resilient Architectures — Multi-AZ, Auto Scaling, ELB

Thursday, May 14, 2026

With #5 we finished the security domain (30%). Now we move into the second domain, resilience (26%). The first question of resilience is simple: “Does the service keep running even if a single instance, or even a whole data center, dies?” The foundation of that answer is Multi-AZ placement, Auto Scaling, and the load balancer.

Availability Zones (AZ) and high availability #

A Region is made up of multiple Availability Zones (AZ). Each AZ is one or more physically separated data centers with independent power, cooling, and networking. AZs are designed so that a failure in one does not take down the others.

The basic principle of high-availability design is to “spread resources across multiple AZs.” If you keep instances in only a single AZ, the service stops when that AZ fails. When the phrase “high availability” appears on the exam, Multi-AZ placement is almost always part of the answer.
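To see why spreading across AZs helps, here is a back-of-the-envelope sketch. It assumes AZ failures are fully independent, which is an idealization, and the 99% per-AZ figure is a made-up illustrative number, not an AWS SLA. The function name `combined_availability` is hypothetical.

```python
def combined_availability(per_az: float, az_count: int) -> float:
    """Probability that at least one of `az_count` AZs is up.

    Assumes each AZ fails independently with the same probability,
    which is a simplification for illustration only.
    """
    if not 0.0 <= per_az <= 1.0:
        raise ValueError("per_az must be between 0 and 1")
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    # The service is down only if every AZ is down at the same time.
    return 1.0 - (1.0 - per_az) ** az_count


if __name__ == "__main__":
    # 0.99 is a hypothetical per-AZ availability, chosen only for the example.
    for n in (1, 2, 3):
        print(n, "AZ(s):", f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions, each extra AZ multiplies the chance of a total outage by the per-AZ failure probability, which is why a second AZ buys far more than a bigger instance in the first one.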

Distinguish Multi-AZ (multiple AZs) from Multi-Region (multiple Regions). Most high availability is satisfied by Multi-AZ, while Multi-Region is a separate topic for surviving an entire Region failure or improving global latency (covered in #7 DR).

Auto Scaling Groups (ASG) #

An Auto Scaling group is the unit that automatically increases and decreases the number of EC2 instances to match demand. It is the core component that delivers both resilience and cost efficiency: failed instances are replaced, and you pay only for the capacity the current load needs.

Components #

Launch Template — defines which AMI, instance type, and security group to launch instances with
Capacity settings — minimum (min) / maximum (max) / desired instance count
Spread across multiple AZs — the ASG distributes instances evenly across the subnets (AZs) you specify, and rebalances when an imbalance occurs
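The three pieces above map fairly directly onto the parameters of the `create_auto_scaling_group` API. The sketch below only builds the keyword arguments as a plain dict and makes no AWS call. The helper name `build_asg_request`, the resource names, and the subnet IDs are hypothetical, and the two-subnet guard is this sketch's own rule, assuming each subnet sits in a different AZ.

```python
def build_asg_request(name, launch_template_name, subnet_ids,
                      min_size, max_size, desired):
    """Build keyword arguments for create_auto_scaling_group (no AWS call is made)."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    if len(subnet_ids) < 2:
        # Sketch-level guard: a single subnet means a single AZ.
        raise ValueError("pass at least two subnets, each in a different AZ")
    return {
        "AutoScalingGroupName": name,
        # Launch Template: AMI, instance type, and security groups live here.
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        # Capacity settings.
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets (and therefore AZs) are passed as a comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


if __name__ == "__main__":
    params = build_asg_request(
        "web-asg", "web-template",
        ["subnet-aaa", "subnet-bbb"],  # hypothetical subnet IDs
        min_size=2, max_size=6, desired=2,
    )
    print(params)
```

With boto3 installed and credentials configured, a dict like this could be passed as `boto3.client("autoscaling").create_auto_scaling_group(**params)`.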

Scaling policies #

Policy	Behavior	Typical use
Target Tracking	Keeps a metric at a target value (e.g. CPU 50%)	Most common, recommended
Step Scaling	Stepwise increase/decrease by metric range	Fine-grained control
Simple Scaling	Adds or removes a fixed amount when an alarm threshold is crossed, then waits for a cooldown	Simple cases
Scheduled Scaling	Adjusts ahead of time at a set time	Predictable traffic (business hours)
Predictive Scaling	Predicts future demand from past patterns	Periodic patterns

The most common requirement, “automatically add instances when traffic increases,” is answered with Target Tracking. When you know the time, as in “traffic spikes every day at 9 AM,” it is Scheduled Scaling.
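As a rough illustration of those two answers, the sketch below builds the request parameters for a Target Tracking policy (`put_scaling_policy`) and a scheduled action (`put_scheduled_update_group_action`). It only constructs dicts and calls nothing. The helper names, policy names, and the 08:30 schedule are assumptions made for the example.

```python
def target_tracking_policy(asg_name: str, target_cpu_percent: float) -> dict:
    """Parameters for put_scaling_policy: hold average CPU near a target."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,
        },
    }


def scheduled_action(asg_name: str, cron: str, desired: int,
                     time_zone: str = "UTC") -> dict:
    """Parameters for put_scheduled_update_group_action: scale at a known time."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": f"{asg_name}-scheduled",  # hypothetical name
        "Recurrence": cron,          # cron expression: minute hour day month weekday
        "TimeZone": time_zone,
        "DesiredCapacity": desired,
    }


if __name__ == "__main__":
    # "Automatically add instances when traffic increases"
    print(target_tracking_policy("web-asg", 50.0))
    # "Traffic spikes every day at 9 AM": scale out a little earlier, at 08:30.
    print(scheduled_action("web-asg", "30 8 * * *", desired=6))
```

Scheduling the action slightly before the spike is a design choice in this sketch: new instances need time to boot and pass health checks before they can take traffic.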

Replacing failed instances via health checks #

The ASG detects unhealthy instances via health checks and terminates and replaces them with new instances. There are two kinds of health check.

EC2 health check — looks only at the instance state (system/instance status checks).
ELB health check — the load balancer inspects at the application level (e.g. HTTP 200). The ASG uses only EC2 health checks by default, so to catch cases where the instance is alive but the app is dead, you must enable the ELB health check type on the ASG.
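The difference between the two health check types can be shown with a toy model. This is a simulation in plain Python, not AWS code: the `Instance` class, its fields, and the instance IDs are all hypothetical, and "HTTP 200 means healthy" is an assumed success code.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    status_checks_ok: bool  # what the EC2 health check looks at
    http_status: int        # what an HTTP health check from the ELB would receive


def instances_to_replace(instances, health_check_type):
    """Return the IDs a simplified ASG would replace under the given check type."""
    if health_check_type not in ("EC2", "ELB"):
        raise ValueError("health_check_type must be 'EC2' or 'ELB'")
    unhealthy = []
    for inst in instances:
        healthy = inst.status_checks_ok
        if health_check_type == "ELB":
            # With the ELB type enabled, the instance must pass both checks.
            healthy = healthy and inst.http_status == 200
        if not healthy:
            unhealthy.append(inst.instance_id)
    return unhealthy


if __name__ == "__main__":
    fleet = [
        Instance("i-ok", True, 200),
        Instance("i-app-dead", True, 503),  # OS is up, application is failing
        Instance("i-hw-dead", False, 0),    # instance itself is impaired
    ]
    print("EC2 only:", instances_to_replace(fleet, "EC2"))
    print("ELB type:", instances_to_replace(fleet, "ELB"))
```

In this model the instance whose app returns 503 is invisible to the EC2-only check and is picked up only once the ELB type is enabled.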

Elastic Load Balancing (ELB) #

ELB distributes traffic across multiple targets (instances, containers, or Lambda functions) and, via health checks, sends it only to healthy ones. There are three current types, distinguished by layer and use; the legacy Classic Load Balancer is left out here.

Type	Layer	Protocol	Characteristics
ALB (Application)	L7	HTTP/HTTPS	Path/host-based routing, container and Lambda targets, WebSocket
NLB (Network)	L4	TCP/UDP/TLS	Ultra-high performance, ultra-low latency, static IP per AZ (Elastic IP supported)
GWLB (Gateway)	L3	IP	Deploy and scale third-party virtual appliances (firewall, IDS)

ALB — if you need to route to different target groups by HTTP path (/api, /img) or host header, use ALB. It is the default choice for microservice and container environments.
NLB — if you need millions of RPS, ultra-low latency, TCP/UDP, or a static IP, use NLB. ALB has no static IP and is exposed only by a DNS name.
GWLB — use it when traffic must pass through a third-party appliance such as a firewall. (AWS abbreviates Gateway Load Balancer as GWLB.)
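Path-based routing on an ALB boils down to an ordered rule list with a default at the end. The sketch below mimics that idea in plain Python. The target group names are hypothetical, and `fnmatchcase` is only a stand-in for the ALB's own wildcard matching, whose exact syntax differs in details.

```python
from fnmatch import fnmatchcase

# Rules are checked in order and the first match wins, like listener rule priorities.
RULES = [
    ("/api/*", "api-targets"),    # hypothetical target group names
    ("/img/*", "image-targets"),
]
DEFAULT_TARGET_GROUP = "web-targets"


def route(path, rules=RULES, default=DEFAULT_TARGET_GROUP):
    """Return the target group that a request path would be forwarded to."""
    for pattern, target_group in rules:
        if fnmatchcase(path, pattern):
            return target_group
    # No rule matched: fall through to the default action.
    return default


if __name__ == "__main__":
    for p in ("/api/users", "/img/logo.png", "/index.html"):
        print(p, "->", route(p))
```

An NLB has no equivalent of this step because it works at L4 and never looks at the HTTP path or host header.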

ELB features to know #

SSL/TLS termination — the load balancer handles the certificate (ACM integration)
Sticky sessions — pins the same client to the same target
Cross-Zone load balancing — distributes evenly across registered targets in all enabled AZs, rather than only within each AZ. Enabled by default for ALB, disabled by default for NLB and GWLB
Connection draining (Deregistration Delay) — waits for in-flight requests to finish before removing a target
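Cross-zone load balancing is easiest to understand with numbers. The sketch below assumes client traffic is split evenly across the load balancer nodes in each AZ and then computes the share each target receives. The function name and the AZ labels are hypothetical.

```python
def traffic_share_per_target(targets_per_az: dict, cross_zone: bool) -> dict:
    """Fraction of total traffic each single target receives, keyed by AZ.

    Assumes every AZ has at least one target and that clients are
    spread evenly across the load balancer nodes of all AZs.
    """
    if not targets_per_az or any(n < 1 for n in targets_per_az.values()):
        raise ValueError("every AZ needs at least one target")
    total_targets = sum(targets_per_az.values())
    az_count = len(targets_per_az)
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            # Every node can reach every target, so all targets are equal.
            shares[az] = 1.0 / total_targets
        else:
            # Each AZ's node keeps its 1/az_count of traffic inside its own AZ.
            shares[az] = (1.0 / az_count) / count
    return shares


if __name__ == "__main__":
    fleet = {"az-a": 2, "az-b": 8}  # unbalanced on purpose
    print("cross-zone on: ", traffic_share_per_target(fleet, cross_zone=True))
    print("cross-zone off:", traffic_share_per_target(fleet, cross_zone=False))
```

With 2 targets in one AZ and 8 in the other, each target gets 10% when cross-zone is on. With it off, each of the 2 targets handles 25% while each of the 8 handles 6.25%, which is why unbalanced fleets behind an NLB can overload the smaller AZ.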

How they work together #

A typical high-availability web tier is wired up like this.

Subnets spanning multiple AZs
An ALB at the front (traffic distribution + health checks)
An Auto Scaling group behind it (scaling with demand, replacing failed instances)

With this combination, the ASG absorbs instance failures, multi-AZ spreading absorbs AZ failures, and scaling absorbs traffic spikes.

Exam question patterns #

“HTTP path/host-based routing.” → ALB
“Need ultra-high-performance TCP or a static IP.” → NLB
“Insert a third-party firewall appliance.” → GWLB
“Automatically scale EC2 with traffic.” → ASG + Target Tracking
“Prepare for a traffic spike at a specific time each day.” → Scheduled Scaling
“Replace instances that are still running but whose app has died.” → use ELB health checks
“High availability is required.” → spread across multiple AZs + ELB + ASG

Common pitfalls #

1) Thinking ALB has a static IP #

A static IP (Elastic IP) is a feature of NLB. ALB is exposed only by a DNS name.

2) Mistaking Auto Scaling for vertical scaling #

ASG is horizontal scaling (increasing/decreasing the instance count). It differs from vertical scaling, where you swap an instance for a larger type.

3) Trying to catch app failures with EC2 health checks #

EC2 health checks look only at instance state. Application-level failures are detected with ELB health checks.

4) Confusing high availability with Multi-Region #

Most high availability is solved with Multi-AZ. Multi-Region is for surviving a Region failure and improving global latency, and it adds significant cost and operational complexity.

Summary #

What we nailed down in this post:

The foundation of high availability is spreading across multiple AZs. Distinguish Multi-AZ from Multi-Region
Auto Scaling — horizontal scaling. Target Tracking is the default, Scheduled for known times. Health checks replace failed instances
ELB — ALB (L7 path routing), NLB (L4 ultra-high performance, static IP), GWLB (appliances)
App-failure detection is the ELB health check. Cross-zone balancing is enabled by default for ALB / disabled by default for NLB
The standard pattern — the combination of multi-AZ + ALB + ASG

Next — Domain 2-2 DR Patterns #

Now that we have resilience within a single Region nailed down, next comes DR strategy for surviving a Region-level disaster.

In #7 Domain 2-2 DR Patterns we will cover the meaning of RTO and RPO, the cost and recovery-time trade-offs of the four DR strategies (Backup & Restore, Pilot Light, Warm Standby, Multi-Site Active/Active), and how to implement them with Route 53 failover and cross-Region replication.