AWS Certified Solutions Architect - Associate (SAA-C03) #6 Domain 2-1 Resilient Architectures: Multi-AZ, Auto Scaling, ELB
With #5 we finished the security domain (30%). Now we move into the second domain, resilience (26%). The first question of resilience is simple: “Does the service keep running even if a single instance, or even a whole data center, dies?” The foundation of that answer is Multi-AZ placement, Auto Scaling, and the load balancer.
Availability Zones (AZ) and high availability #
A Region is made up of multiple Availability Zones (AZ). Each AZ is one or more physically separated data centers with independent power, cooling, and networking. The AZs are isolated from one another, so a failure in one AZ is designed not to take the others down with it.
The basic principle of high-availability design is to “spread resources across multiple AZs.” If you keep instances in only a single AZ, the service stops when that AZ fails. When the word “high availability” appears on the exam, Multi-AZ placement is almost always part of the answer.
Distinguish Multi-AZ (multiple AZs) from Multi-Region (multiple Regions). Most high availability is satisfied by Multi-AZ, while Multi-Region is a separate topic for surviving an entire Region failure or improving global latency (covered in #7 DR).
Auto Scaling Groups (ASG) #
An Auto Scaling group is the unit that automatically increases and decreases the number of EC2 instances to match demand. It is the core component that delivers both resilience and cost efficiency at once.
Components #
- Launch Template — defines which AMI, instance type, and security group to launch instances with
- Capacity settings — minimum (min) / maximum (max) / desired instance count
- Spread across multiple AZs — the ASG distributes instances evenly across the subnets (AZs) you specify, and rebalances when an imbalance occurs
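The capacity settings and the even spread across AZs can be pictured with a small model. This is only an illustrative sketch in plain Python, not the actual ASG placement logic; the function names `clamp_desired` and `spread_across_azs` are made up for this example.

```python
def clamp_desired(desired: int, minimum: int, maximum: int) -> int:
    """Keep the desired count inside the group's min/max bounds."""
    return max(minimum, min(desired, maximum))


def spread_across_azs(desired: int, azs: list[str]) -> dict[str, int]:
    """Split `desired` instances as evenly as possible across `azs`."""
    if not azs:
        raise ValueError("at least one AZ is required")
    base, extra = divmod(desired, len(azs))
    # The first `extra` AZs each take one more instance than the rest.
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}


if __name__ == "__main__":
    count = clamp_desired(desired=12, minimum=2, maximum=10)
    print(spread_across_azs(count, ["az-a", "az-b", "az-c"]))
```

With three AZs and a clamped count of 10, no AZ ends up more than one instance ahead of another, which is the balance the ASG tries to keep.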
Scaling policies #
| Policy | Behavior | Typical use |
|---|---|---|
| Target Tracking | Keeps a metric at a target value (e.g. CPU 50%) | Most common, recommended |
| Step Scaling | Stepwise increase/decrease by metric range | Fine-grained control |
| Simple Scaling | A fixed amount when a threshold is crossed | Simple cases |
| Scheduled Scaling | Adjusts ahead of time at a set time | Predictable traffic (business hours) |
| Predictive Scaling | Predicts future demand from past patterns | Periodic patterns |
The most common requirement, “automatically add instances when traffic increases,” is answered with Target Tracking. When you know the time, as in “traffic spikes every day at 9 AM,” it is Scheduled Scaling.
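As a rough illustration of those two answers, the sketch below builds the request parameters for a Target Tracking policy and a daily scheduled action. It assumes the boto3 Auto Scaling client (`put_scaling_policy` and `put_scheduled_update_group_action`); the group name, policy names, and the 08:45 schedule are hypothetical values chosen for the example, and nothing is sent to AWS.

```python
def target_tracking_policy(asg_name: str, cpu_target: float = 50.0) -> dict:
    """Keyword arguments for a Target Tracking policy on average CPU."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": cpu_target,
        },
    }


def morning_scale_out(asg_name: str, desired: int) -> dict:
    """Keyword arguments for a scheduled action ahead of a 9 AM spike."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": f"{asg_name}-morning",  # hypothetical name
        # Cron expression: 08:45 every day. Evaluated in UTC unless a
        # TimeZone parameter is also supplied.
        "Recurrence": "45 8 * * *",
        "DesiredCapacity": desired,
    }


if __name__ == "__main__":
    # With credentials configured, these dicts would be passed along as:
    #   client = boto3.client("autoscaling")
    #   client.put_scaling_policy(**target_tracking_policy("web-asg"))
    #   client.put_scheduled_update_group_action(**morning_scale_out("web-asg", 6))
    print(target_tracking_policy("web-asg"))
    print(morning_scale_out("web-asg", 6))
```

Scheduling the action a little before the known spike gives new instances time to boot and pass health checks before the traffic arrives.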
Replacing failed instances via health checks #
The ASG detects unhealthy instances via health checks and terminates and replaces them with new instances. There are two kinds of health check.
- EC2 health check: looks only at the instance state (system/instance status checks). This is the ASG's default.
- ELB health check: the load balancer inspects at the application level (e.g. HTTP 200). To catch cases where the instance is running but the app is dead, you must enable the ELB health check type on the ASG.
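The difference between the two health check types can be sketched as a small decision function. This is a simplified model, not the ASG's real implementation; the `Instance` class and `needs_replacement` function are hypothetical names, and a healthy app is assumed to answer the probe with HTTP 200.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    status_checks_ok: bool  # EC2 system/instance status checks
    app_http_status: int    # what the load balancer's probe receives


def needs_replacement(inst: Instance, health_check_type: str) -> bool:
    """Return True if the ASG would treat the instance as unhealthy."""
    # EC2 status checks count under both health check types.
    if not inst.status_checks_ok:
        return True
    # The application-level result only counts when the type is "ELB".
    if health_check_type == "ELB":
        return inst.app_http_status != 200
    return False


if __name__ == "__main__":
    # The instance is running, but the application returns errors.
    zombie = Instance("i-example", status_checks_ok=True, app_http_status=503)
    print(needs_replacement(zombie, "EC2"))  # not detected
    print(needs_replacement(zombie, "ELB"))  # detected and replaced
```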
Elastic Load Balancing (ELB) #
ELB distributes traffic across multiple instances (or containers, Lambda) and, via health checks, only sends it to healthy targets. There are three types, distinguished by layer and use.
| Type | Layer | Protocol | Characteristics |
|---|---|---|---|
| ALB (Application) | L7 | HTTP/HTTPS | Path/host-based routing, container and Lambda targets, WebSocket |
| NLB (Network) | L4 | TCP/UDP/TLS | Ultra-high performance, ultra-low latency, static IP (Elastic IP) |
| GLB (Gateway) | L3 | IP | Deploy and scale third-party virtual appliances (firewall, IDS) |
- ALB: if you need to route to different target groups by HTTP path (/api, /img) or host header, use ALB. It is the default choice for microservice and container environments.
- NLB: if you need millions of RPS, ultra-low latency, TCP/UDP, or a static IP, use NLB. ALB has no static IP and is exposed only by a DNS name.
- GLB: use it when traffic must pass through a third-party appliance such as a firewall.
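Path- and host-based routing on an ALB can be pictured as an ordered rule list where the first matching rule wins and a default rule catches everything else. The sketch below is a simplified model under that assumption: it uses plain prefix matching instead of ALB's wildcard patterns, and the host names and target group names are hypothetical.

```python
from typing import Optional

# Each rule: (host to match or None, path prefix to match or None, target group)
Rule = tuple[Optional[str], Optional[str], str]


def pick_target_group(host: str, path: str, rules: list[Rule], default: str) -> str:
    """Return the target group of the first rule that matches the request."""
    for host_match, path_prefix, target_group in rules:
        host_ok = host_match is None or host == host_match
        path_ok = path_prefix is None or path.startswith(path_prefix)
        if host_ok and path_ok:
            return target_group
    # No rule matched: fall through to the listener's default action.
    return default


if __name__ == "__main__":
    rules: list[Rule] = [
        (None, "/api", "tg-api"),                   # path-based rule
        (None, "/img", "tg-images"),                # path-based rule
        ("admin.example.com", None, "tg-admin"),    # host-based rule
    ]
    print(pick_target_group("www.example.com", "/api/users", rules, "tg-web"))
    print(pick_target_group("admin.example.com", "/", rules, "tg-web"))
    print(pick_target_group("www.example.com", "/", rules, "tg-web"))
```

An NLB works at L4 and never sees the HTTP path or host header, which is why this kind of rule table points to ALB.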
ELB features to know #
- SSL/TLS termination — the load balancer handles the certificate (ACM integration)
- Sticky sessions — pins the same client to the same target
- Cross-Zone load balancing — distributes evenly across targets in all AZs. Enabled by default for ALB, disabled by default for NLB
- Connection draining (Deregistration Delay) — waits for in-flight requests to finish before removing a target
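The effect of Cross-Zone load balancing is easiest to see with numbers. The sketch below is a simplified model that assumes client traffic arrives evenly at the load balancer node in each AZ; the function name `share_per_target` and the AZ labels are made up for this example.

```python
def share_per_target(targets_per_az: dict[str, int], cross_zone: bool) -> dict[str, float]:
    """Fraction of total traffic that one target in each AZ receives."""
    azs = {az: n for az, n in targets_per_az.items() if n > 0}
    if not azs:
        raise ValueError("at least one AZ must contain a target")
    if cross_zone:
        # Every target in every AZ gets an equal slice of all traffic.
        total = sum(azs.values())
        return {az: 1 / total for az in azs}
    # Each AZ's node keeps its slice and splits it among local targets only.
    per_az = 1 / len(azs)
    return {az: per_az / n for az, n in azs.items()}


if __name__ == "__main__":
    layout = {"az-a": 2, "az-b": 8}
    print(share_per_target(layout, cross_zone=True))
    print(share_per_target(layout, cross_zone=False))
```

With 2 targets in one AZ and 8 in the other, cross-zone gives every target 10% of the traffic. Without it, each target in the small AZ carries 25% while each target in the large AZ carries 6.25%, so an uneven fleet gets uneven load.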
How they work together #
A typical high-availability web tier is wired up like this.
- Subnets spanning multiple AZs
- An ALB at the front (traffic distribution + health checks)
- An Auto Scaling group behind it (scaling with demand, replacing failed instances)
With this combination, the ASG absorbs instance failures, multi-AZ spreading absorbs AZ failures, and scaling absorbs traffic spikes.
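How the group absorbs an AZ failure can be sketched as a small reconcile step: count the healthy instances in the AZs that are still usable, then launch replacements into whichever usable AZ has the fewest. This is an illustrative model only, not the real ASG algorithm, and `plan_replacements` is a hypothetical helper name.

```python
from collections import Counter


def plan_replacements(
    instances: list[tuple[str, bool]], desired: int, usable_azs: list[str]
) -> dict[str, int]:
    """Given (az, healthy) pairs, return how many instances to launch per AZ."""
    if not usable_azs:
        raise ValueError("no usable AZ left to launch into")
    counts = Counter({az: 0 for az in usable_azs})
    for az, healthy in instances:
        # Instances in failed AZs or failing health checks do not count.
        if healthy and az in counts:
            counts[az] += 1
    launches: Counter = Counter()
    for _ in range(max(0, desired - sum(counts.values()))):
        # Launch into the usable AZ that currently has the fewest instances.
        az = min(usable_azs, key=lambda a: counts[a])
        counts[az] += 1
        launches[az] += 1
    return dict(launches)


if __name__ == "__main__":
    fleet = [("az-a", True), ("az-a", True), ("az-b", True),
             ("az-b", True), ("az-c", True), ("az-c", True)]
    # az-c goes down: only az-a and az-b remain usable.
    print(plan_replacements(fleet, desired=6, usable_azs=["az-a", "az-b"]))
```

The ALB stops sending traffic to the failed AZ as soon as its health checks fail, while the ASG brings capacity back up to the desired count in the surviving AZs.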
Exam question patterns #
- “HTTP path/host-based routing.” → ALB
- “Need ultra-high-performance TCP or a static IP.” → NLB
- “Insert a third-party firewall appliance.” → GLB
- “Automatically scale EC2 with traffic.” → ASG + Target Tracking
- “Prepare for a traffic spike at a specific time each day.” → Scheduled Scaling
- “Replace instances that are still running but whose app has died.” → use ELB health checks
- “High availability is required.” → spread across multiple AZs + ELB + ASG
Common pitfalls #
1) Thinking ALB has a static IP #
A static IP (Elastic IP) is a feature of NLB. ALB is exposed only by a DNS name.
2) Mistaking Auto Scaling for vertical scaling #
ASG is horizontal scaling (increasing/decreasing the instance count). It differs from vertical scaling, where you swap an instance for a larger type.
3) Trying to catch app failures with EC2 health checks #
EC2 health checks look only at instance state. Application-level failures are detected with ELB health checks.
4) Confusing high availability with Multi-Region #
Most high availability is solved with Multi-AZ. Multi-Region is for surviving a Region failure and improving global latency, and it adds significant cost and operational complexity.
Summary #
What we nailed down in this post:
- The foundation of high availability is spreading across multiple AZs. Distinguish Multi-AZ from Multi-Region
- Auto Scaling: horizontal scaling. Target Tracking is the default, Scheduled for known times. Health checks replace failed instances
- ELB: ALB (L7 path routing), NLB (L4 ultra-high performance, static IP), GLB (appliances)
- App-failure detection is the ELB health check. Cross-zone balancing is enabled by default for ALB / disabled by default for NLB
- The standard pattern: the combination of multi-AZ + ALB + ASG
Next — Domain 2-2 DR Patterns #
Now that we have resilience within a single Region nailed down, next comes DR strategy for surviving a Region-level disaster.
In #7 Domain 2-2 DR Patterns we will cover the meaning of RTO and RPO, the cost and recovery-time trade-offs of the four DR strategies (Backup & Restore, Pilot Light, Warm Standby, Multi-Site Active/Active), and how to implement them with Route 53 failover and cross-Region replication.