EC2 and VPC Basics
The cloud's oldest compute and network, EC2 and VPC. How instance types, AMIs, and EBS, plus VPC / subnets / route tables / IGW / NAT all weave into one picture — laying the first skeleton of your operational infrastructure.
If you finished the preparation phase in Part 1 — accounts and IAM, cost, CLI, security, and CloudWatch — this chapter is where you actually start putting something up. Part 2’s goal is to thread the components you meet most often on AWS — EC2, VPC, S3, RDS, Route 53, ALB, and CloudFront — into one operational picture.
The first step is EC2 and VPC. They are the cloud’s oldest, most fundamental tools. EC2 is a single virtual computer, and VPC is the network those computers live in. The two always travel together. The skeleton we lay in this chapter carries straight into the security rules and access of Chapter 9 EC2 Operations and the subnet placement of Chapter 11 RDS.
In this chapter we walk through every component that is involved behind the scenes when you launch a single EC2 instance: instance types, AMI, EBS, VPC, subnets, route tables, IGW, and NAT Gateway.
The big picture
When you launch one EC2 instance from the console, all of the following pieces are actually in play behind it.
┌──────────────────────────────────────┐
│ VPC                                  │
│ 10.0.0.0/16                          │
│                                      │
│  ┌────────────────────────────────┐  │
│  │ Subnet (Public)                │  │
│  │ 10.0.1.0/24                    │  │
│  │                                │  │
│  │  ┌───────────────┐             │  │
│  │  │ EC2 instance  │             │  │
│  │  │ AMI = OS      │             │  │
│  │  │ EBS = disk    │             │  │
│  │  │ ENI = network │             │  │
│  │  └───────────────┘             │  │
│  └───────────────┬────────────────┘  │
│                  │                   │
│                  ▼                   │
│             Route Table              │
│                  │                   │
│                  ▼                   │
│        Internet Gateway (IGW)        │
└──────────────────┼───────────────────┘
                   ▼
               internet
Let’s walk through the roles that appear in this picture in order.
EC2 — a single virtual computer
EC2 (Elastic Compute Cloud) is AWS’s virtual machine service. Each Instance, as the console calls it, is a single virtual Linux or Windows computer.
Instance types
For EC2 instance types, the name is the spec.
t3.micro
││ └── size (nano / micro / small / medium / large / xlarge / 2xlarge / ...)
│└──── generation (1, 2, 3, 4, 5, 6, 7 ...)
└───── family
The commonly used families are as follows.
| Family | Nickname | Role |
|---|---|---|
| t | Burstable | Workloads that normally use little but suddenly spike. The default for dev / side projects |
| m | General purpose | Balanced CPU / memory. General web servers |
| c | Compute optimized | CPU-heavy work (encoding, build servers) |
| r | Memory optimized | Memory-heavy work (Redis, large caches) |
| i, d | Storage optimized | Work with large local NVMe / HDD (DBs, data pipelines) |
| g, p | GPU | Machine learning / graphics |
At first almost everyone starts with t3.micro / t3.small / t3.medium. The t family uses a CPU credit model: the instance earns credits while it runs below its baseline and spends them when it bursts, so it is cost-efficient when average utilization is low. Once you move into production, it’s common to switch to the m family.
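To make the credit model concrete, here is a minimal sketch. It assumes the commonly documented t3.micro figures (2 vCPUs, a 10% baseline per vCPU) and the definition that one CPU credit is one vCPU at 100% for one minute. The function names are made up for illustration, and the figures should be checked against the current AWS documentation.

```python
def credits_earned_per_hour(vcpus: int, baseline_pct: int) -> float:
    """Credits accrued per hour: each vCPU earns its baseline share of 60 minutes."""
    return vcpus * baseline_pct * 60 / 100


def credits_spent_per_hour(vcpus: int, utilization_pct: int) -> float:
    """Credits consumed per hour at a given average utilization per vCPU."""
    return vcpus * utilization_pct * 60 / 100


def net_credits_per_hour(vcpus: int, baseline_pct: int, utilization_pct: int) -> float:
    """Positive means the balance grows; negative means the instance is bursting."""
    earned = credits_earned_per_hour(vcpus, baseline_pct)
    spent = credits_spent_per_hour(vcpus, utilization_pct)
    return earned - spent


# Assumed figures for t3.micro: 2 vCPUs, 10% baseline per vCPU.
print(credits_earned_per_hour(2, 10))   # 12.0
print(net_credits_per_hour(2, 10, 5))   # 6.0   (idle-ish: balance grows)
print(net_credits_per_hour(2, 10, 40))  # -36.0 (sustained burst: balance drains)
```

If the net figure stays negative for long, the workload is a poor fit for the t family, which is one signal for moving to m.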
t3 vs t3a vs t4g.
t3 is Intel, t3a is AMD (about 10% cheaper), and t4g is ARM (Graviton, about 20% cheaper and still fast). For new workloads, consider t4g first, after confirming that your software runs on ARM.
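The naming scheme can be read mechanically. The sketch below splits a name into family, generation, suffix, and size with a deliberately simplified regular expression; `parse_instance_type` and `InstanceType` are illustrative names, and the pattern is an assumption that covers names like the ones in this chapter, not every instance type AWS offers.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str      # e.g. "t", "m", "c"
    generation: int  # e.g. 3
    suffix: str      # extra letters such as "a" (AMD) or "g" (Graviton); may be empty
    size: str        # e.g. "micro"


# Simplified pattern: letters, digits, optional suffix letters, a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceType:
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, suffix, size = match.groups()
    return InstanceType(family, int(generation), suffix, size)


print(parse_instance_type("t3.micro"))
print(parse_instance_type("t4g.small"))
```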
AMI — the OS image
AMI (Amazon Machine Image) is the OS image used to launch an instance. It’s a snapshot of an OS like “Ubuntu 22.04” or “Amazon Linux 2023” plus pre-installed tools.
The commonly used AMI kinds are as follows.
- Amazon Linux 2023 — a Fedora-derived, RPM-based distro built and maintained by AWS. The smoothest fit on EC2.
- Ubuntu LTS — the most familiar choice. 22.04 / 24.04.
- Debian — lighter than Ubuntu.
- Windows Server — the license cost is included in the instance’s hourly price.
- A user-created AMI — created by snapshotting a running instance (Chapter 9 EC2 Operations).
AMIs are per-region. You cannot use an AMI from the Seoul region as-is in the Tokyo region; copy it into the target region first (for example with aws ec2 copy-image).
EBS — the disk
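EBS (Elastic Block Store) is the block storage attached to EC2, that is, a virtual SSD. An EBS volume is a resource separate from the instance, so it can outlive it: whether it is deleted when the instance is terminated depends on its DeleteOnTermination setting (on by default for the root volume, off by default for additional volumes). Without EBS, EC2 is an empty shell.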
The volume types are as follows.
| Type | Abbrev. | Role |
|---|---|---|
| General Purpose SSD | gp3 | Default. Good price/performance |
| General Purpose SSD (old) | gp2 | Old default. Use gp3 for new work |
| Provisioned IOPS SSD | io2 / io2 Block Express | DBs needing high IOPS |
| Throughput Optimized HDD | st1 | Large sequential reads (big data) |
| Cold HDD | sc1 | Backups accessed rarely |
Most workloads start with gp3. It’s up to 20% cheaper per GB than gp2 while letting you tune IOPS and throughput independently of volume size.
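The practical difference shows up with small volumes. The sketch below compares baseline IOPS under the figures as I understand them from the AWS documentation (gp2: 3 IOPS per GiB, minimum 100, maximum 16,000; gp3: a flat 3,000 baseline). Treat those numbers as assumptions to verify; the function name is illustrative.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 ties IOPS to size: 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return min(max(100, 3 * size_gib), 16_000)


GP3_BASELINE_IOPS = 3_000  # assumed gp3 baseline, independent of volume size

for size in (20, 100, 500, 2_000):
    print(f"{size:>5} GiB  gp2={gp2_baseline_iops(size):>6} IOPS  gp3={GP3_BASELINE_IOPS} IOPS")
```

Under these assumptions a 100 GiB gp2 volume gets 300 baseline IOPS, while a gp3 volume of the same size starts at 3,000, which is why gp3 is the usual default for small and medium volumes.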
There’s also another kind called Instance Store. It’s NVMe physically attached to the host, so it’s fast, but its data is lost when the instance is stopped, hibernated, or terminated. Use it only for cache or temporary purposes.
VPC — the virtual network
VPC (Virtual Private Cloud) is your own network created inside AWS. It’s the private IP space where resources like EC2 / RDS / Lambda live.
CIDR block
The first thing you decide when creating a VPC is the CIDR block, that is, the IP address range.
10.0.0.0/16      # 10.0.0.0 ~ 10.0.255.255 (65,536)
172.16.0.0/16    # 172.16.0.0 ~ 172.16.255.255
192.168.0.0/16   # private IP space
A /16 has 65,536 IPs, and a /24 has 256. It’s common to start with a /16 and carve subnets out of it.
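These sizes can be checked with Python's standard `ipaddress` module. The sketch also subtracts the five addresses AWS reserves in every subnet (the first four and the last one), which is a detail added here rather than taken from the text above; `describe` is an illustrative helper name.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one in each subnet


def describe(cidr: str) -> tuple[str, str, int]:
    """Return (first address, last address, total address count) for a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net[0]), str(net[-1]), net.num_addresses


print(describe("10.0.0.0/16"))  # ('10.0.0.0', '10.0.255.255', 65536)
print(describe("10.0.1.0/24"))  # ('10.0.1.0', '10.0.1.255', 256)

usable = ipaddress.ip_network("10.0.1.0/24").num_addresses - AWS_RESERVED_PER_SUBNET
print(usable)  # 251
```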
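Use private IP space only. Pick the VPC CIDR from RFC 1918’s private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). AWS does not strictly forbid other ranges, but if you use publicly routable addresses as the VPC CIDR, the real internet hosts in that range can become unreachable from inside the VPC.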
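A candidate CIDR can be validated against the three RFC 1918 ranges before you create anything. This is a small sketch using the standard `ipaddress` module; `is_rfc1918` is an illustrative name, not an AWS API.

```python
import ipaddress

RFC1918_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """True if the whole block sits inside one of the RFC 1918 private ranges."""
    net = ipaddress.ip_network(cidr)
    return any(net.subnet_of(private) for private in RFC1918_RANGES)


print(is_rfc1918("10.0.0.0/16"))    # True
print(is_rfc1918("172.31.0.0/16"))  # True  (172.16.0.0/12 runs up to 172.31.255.255)
print(is_rfc1918("172.32.0.0/16"))  # False (just outside the /12)
```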
Default VPC
A new account has a Default VPC pre-created in every region. It uses CIDR 172.31.0.0/16, has one public subnet in each AZ, and comes with an IGW already attached.
The default VPC is enough for learning or prototypes. For production, the common practice is to move to a freshly created custom VPC. Every subnet in the default VPC is public and auto-assigns public IPs, so anything launched there is internet-facing unless its Security Group says otherwise, which is weak from a security standpoint. VPC design is covered in more depth in Chapter 28 VPC in Depth.
Subnets — compartments inside a VPC
The unit that subdivides a VPC is a Subnet. A subnet belongs to one AZ. So for Multi-AZ operation, you must create subnets across multiple AZs as well.
VPC 10.0.0.0/16
│
├── Public Subnet A     10.0.1.0/24   (AZ a) ── ALB, NAT, Bastion
├── Public Subnet B     10.0.2.0/24   (AZ b)
│
├── Private Subnet A    10.0.10.0/24  (AZ a) ── EC2 (app)
├── Private Subnet B    10.0.11.0/24  (AZ b)
│
├── Database Subnet A   10.0.20.0/24  (AZ a) ── RDS
└── Database Subnet B   10.0.21.0/24  (AZ b)
You don’t set a subnet’s kind directly; its route table decides it. There is no separate “public” attribute: a subnet that has a route to an IGW is, by definition, public.
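The layout above can be derived mechanically from the VPC CIDR. The sketch below carves the /16 into /24 blocks with the standard `ipaddress` module and picks the same third-octet indexes as the diagram; the subnet names in `plan` are illustrative labels, not AWS identifiers.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
slash24s = list(vpc.subnets(new_prefix=24))  # 256 blocks: 10.0.0.0/24 ... 10.0.255.0/24

# Third-octet index for each subnet, mirroring the layout above.
plan = {
    "public-a": 1, "public-b": 2,
    "private-a": 10, "private-b": 11,
    "database-a": 20, "database-b": 21,
}

subnets = {name: slash24s[index] for name, index in plan.items()}
for name, net in subnets.items():
    print(f"{name:<11} {net}")
```

Leaving gaps between the groups (1-2, 10-11, 20-21) keeps room to add a third AZ later without renumbering.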
Public vs Private subnets
| | Public subnet | Private subnet |
|---|---|---|
| Internet in/out | Directly possible | Out only, via NAT |
| Auto-assign public IP | If you turn the option on | Usually off |
| Resources placed | ALB, NAT GW, Bastion | EC2 (app), Lambda |
| Routing | 0.0.0.0/0 to IGW | 0.0.0.0/0 to NAT GW |
The production pattern is to keep app servers in private subnets and put an ALB in the public subnets to receive external traffic. Because no public IP is attached directly to the app servers, the attack surface shrinks.
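Since the route table is what makes a subnet public or private, the classification can be written as a function of the default route alone. This is a minimal sketch: the route table is modeled as a plain dict of destination to target ID, and "isolated" is a label introduced here for a subnet with no default route at all.

```python
def subnet_kind(routes: dict[str, str]) -> str:
    """Classify a subnet from its route table (destination CIDR -> target ID)."""
    default_target = routes.get("0.0.0.0/0", "")
    if default_target.startswith("igw-"):
        return "public"
    if default_target.startswith("nat-"):
        return "private"
    return "isolated"  # no default route: no path to the internet at all


print(subnet_kind({"10.0.0.0/16": "local", "0.0.0.0/0": "igw-xxxxxxxx"}))  # public
print(subnet_kind({"10.0.0.0/16": "local", "0.0.0.0/0": "nat-xxxxxxxx"}))  # private
print(subnet_kind({"10.0.0.0/16": "local"}))                               # isolated
```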
Route tables — where to send it
A Route Table defines where a subnet’s traffic goes. Each subnet is associated with exactly one route table at a time, while one route table can be shared by several subnets.
Public subnet:
| Destination | Target | Meaning |
|---|---|---|
| 10.0.0.0/16 | local | Within the VPC is always local |
| 0.0.0.0/0 | igw-xxxxxxxx | Everything else goes to the IGW (internet) |
Private subnet:
| Destination | Target | Meaning |
|---|---|---|
| 10.0.0.0/16 | local | Within the VPC is always local |
| 0.0.0.0/0 | nat-xxxxxxxx | Everything else goes to the NAT GW |
The local route is added automatically and cannot be removed. All subnets inside the same VPC can always route to each other (though SG / NACL may block the traffic).
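When several routes match a destination, the most specific one (the longest prefix) wins, which is why the local route takes precedence over 0.0.0.0/0 for addresses inside the VPC. A minimal sketch of that lookup, with the route table modeled as a list of (CIDR, target) pairs and `lookup` as an illustrative name:

```python
import ipaddress


def lookup(routes: list[tuple[str, str]], destination_ip: str) -> str:
    """Return the target of the most specific route that contains destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination_ip}")
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


public_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-xxxxxxxx")]
print(lookup(public_routes, "10.0.10.5"))  # local
print(lookup(public_routes, "8.8.8.8"))    # igw-xxxxxxxx
```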
IGW — Internet Gateway
The IGW is the gateway that connects the VPC to the internet, in both directions. You attach at most one per VPC (if you don’t attach one, it’s a fully private VPC).
The IGW itself does not route traffic. Traffic only flows once you add the IGW as a target in a route table. That’s why, even within a VPC using the same IGW, you can make only some subnets public.
The IGW is free, and its availability is managed automatically.
NAT Gateway — from private out to the internet
For an EC2 in a private subnet to receive OS patches or call external APIs, it needs internet access. But the internet must not be able to initiate connections to that EC2. NAT (Network Address Translation) plays that role.
[Private EC2] ──out──▶ [NAT GW (Public Subnet)] ──out──▶ [IGW] ──▶ internet
              ◀── no inbound
You place the NAT GW in a public subnet, because it needs its own route to the IGW and an Elastic IP to reach the internet. When the private subnet’s route table points to the NAT GW as the target for 0.0.0.0/0, the private subnet’s traffic goes out to the internet through the NAT.
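Why outbound works while inbound does not becomes clearer with a toy model of source NAT. The class below is a teaching sketch only, not how the managed NAT Gateway is implemented: it rewrites the source of outgoing flows, remembers the mapping, and admits only replies that match a remembered flow. The addresses are documentation examples.

```python
from typing import Optional


class NatSketch:
    """Toy source-NAT: remembers outbound flows and only admits their replies."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.next_port = 1024
        # public port -> (private ip, private port)
        self.table: dict[int, tuple[str, int]] = {}

    def outbound(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Rewrite the source of an outgoing packet and record the mapping."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return self.public_ip, public_port

    def inbound(self, public_port: int) -> Optional[tuple[str, int]]:
        """Replies to a known flow are translated back; anything else is dropped."""
        return self.table.get(public_port)


nat = NatSketch("203.0.113.10")
print(nat.outbound("10.0.10.5", 51000))  # ('203.0.113.10', 1024)
print(nat.inbound(1024))                 # ('10.0.10.5', 51000)
print(nat.inbound(4444))                 # None: unsolicited traffic has no mapping
```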
NAT GW vs NAT Instance
In the old days, you ran NAT yourself on a single EC2 (a NAT Instance). Now you almost always use the managed NAT Gateway.
| | NAT Gateway | NAT Instance (old) |
|---|---|---|
| Management | AWS manages it | Yourself |
| Availability | Redundant within its AZ (for Multi-AZ, place one per AZ) | A single EC2 |
| Bandwidth | Up to 100 Gbps | Depends on EC2 size |
| Cost | Per hour + per GB | EC2 cost only |
The NAT GW cost trap
The NAT Gateway is billed both per hour and per GB of data processed. With heavy traffic, it can become a surprisingly large cost. Ways to save are as follows.
- Placing only one NAT GW instead of one per AZ reduces cost but lowers availability (if the AZ hosting the NAT GW fails, the private subnets in the other AZs lose internet access too).
- Route S3 / DynamoDB traffic around the NAT with a Gateway VPC Endpoint (free).
- Services like ECR / Secrets Manager can also bypass the NAT with an Interface VPC Endpoint (it has a per-hour charge plus a per-GB charge, but the per-GB rate is lower than the NAT GW’s).
The kinds and design of VPC Endpoints are covered in earnest in Chapter 28 VPC in Depth.
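The two-part billing described in this section is easy to estimate before the bill arrives. The sketch below is a back-of-the-envelope model; the rates are placeholder values chosen for illustration, so look up the current price for your region, and `nat_gateway_monthly_cost` is an illustrative name.

```python
def nat_gateway_monthly_cost(
    gateways: int,
    gb_processed: float,
    hourly_rate: float,
    per_gb_rate: float,
    hours: int = 730,
) -> float:
    """Hourly charge for every gateway plus a per-GB charge on processed data."""
    return gateways * hours * hourly_rate + gb_processed * per_gb_rate


# Placeholder rates for illustration only; real prices vary by region.
HOURLY, PER_GB = 0.045, 0.045

for gb in (100, 1_000, 10_000):
    one = nat_gateway_monthly_cost(1, gb, HOURLY, PER_GB)
    two = nat_gateway_monthly_cost(2, gb, HOURLY, PER_GB)
    print(f"{gb:>6} GB  one NAT GW: ${one:,.2f}  two NAT GWs: ${two:,.2f}")
```

At low volume the fixed hourly part dominates, so the number of gateways matters most; at high volume the per-GB part dominates, which is where VPC Endpoints pay off.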
Security Group vs NACL — the firewall
A VPC’s firewall operates in two layers. The detailed design is covered in Chapter 9 EC2 Operations, so here we’ll just look at the picture.
internet ──▶ NACL (subnet level) ──▶ Security Group (instance level) ──▶ EC2
| | Security Group | NACL |
|---|---|---|
| Applies at | Instance (ENI) | Subnet |
| Stateful | Yes (responses auto-allowed) | No (return traffic needs its own rule) |
| Rules | Allow only (no Deny) | Allow + Deny |
| Frequency of use | Daily | Rarely touched |
In practice you work almost exclusively with SGs and leave NACLs at their defaults.
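The stateful versus stateless distinction is the part that most often causes surprises, so here is a toy model of it. Both classes are illustrative sketches, not AWS behavior in full: the first remembers outbound flows the way a Security Group does, the second judges each packet only against explicit inbound rules the way a NACL does. The 1024-65535 range stands in for the ephemeral ports a reply arrives on.

```python
class StatefulFilter:
    """Security-group-like: an allowed outbound flow automatically admits its reply."""

    def __init__(self) -> None:
        self.tracked: set[tuple[str, int]] = set()

    def send(self, remote_ip: str, remote_port: int) -> None:
        self.tracked.add((remote_ip, remote_port))

    def allows_reply(self, remote_ip: str, remote_port: int) -> bool:
        return (remote_ip, remote_port) in self.tracked


class StatelessFilter:
    """NACL-like: every packet is judged on its own against explicit inbound rules."""

    def __init__(self, inbound_allowed_ports: range) -> None:
        self.inbound_allowed_ports = inbound_allowed_ports

    def allows_reply(self, local_port: int) -> bool:
        return local_port in self.inbound_allowed_ports


sg = StatefulFilter()
sg.send("198.51.100.7", 443)
print(sg.allows_reply("198.51.100.7", 443))  # True: the reply rides on the tracked flow

nacl_without_rule = StatelessFilter(range(0))
nacl_with_rule = StatelessFilter(range(1024, 65536))
print(nacl_without_rule.allows_reply(51000))  # False: no inbound rule for the reply
print(nacl_with_rule.allows_reply(51000))     # True: ephemeral ports explicitly allowed
```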
Launching a single EC2 — CLI
Instead of clicking through the console, you can launch an instance with a single CLI command.
aws ec2 run-instances \
  --image-id ami-0f3a440bbcff3d043 \
  --instance-type t3.micro \
  --key-name my-key \
  --security-group-ids sg-0abc1234def567890 \
  --subnet-id subnet-0abc1234def567890 \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=my-server}]'
Every choice you made in the console maps to a flag, one by one. Once you’ve seen the CLI, it becomes clear what the console was asking you to choose. For CLI / SDK setup, see Chapter 4 AWS CLI and SDK.
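The one-to-one mapping between choices and flags can be made explicit by building the same command from named parameters. The helper below only assembles and prints the argument list; it does not call AWS. `build_run_instances_cmd` is an illustrative name, and the IDs are the example values from the command above.

```python
import shlex


def build_run_instances_cmd(
    image_id: str,
    instance_type: str,
    key_name: str,
    security_group_id: str,
    subnet_id: str,
    name: str,
) -> list[str]:
    """Assemble the argv for `aws ec2 run-instances`; nothing is executed here."""
    tags = f"ResourceType=instance,Tags=[{{Key=Name,Value={name}}}]"
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--security-group-ids", security_group_id,
        "--subnet-id", subnet_id,
        "--tag-specifications", tags,
    ]


cmd = build_run_instances_cmd(
    "ami-0f3a440bbcff3d043", "t3.micro", "my-key",
    "sg-0abc1234def567890", "subnet-0abc1234def567890", "my-server",
)
print(shlex.join(cmd))  # shell-quoted, ready to paste into a terminal
```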
Common pitfalls
- “Why can’t my EC2 reach the internet?” — Check the following in order. (1) Is a public IP attached (the subnet’s auto-assign public IP, or an EIP)? (2) Is the subnet public (does the route table have 0.0.0.0/0 → igw-...)? (3) Does the SG allow outbound? (4) Does the NACL allow outbound, and the return traffic inbound? (5) Is an OS-level firewall (ufw, iptables) blocking it? It’s usually (1) or (2), and if it’s a private subnet, check the NAT GW path.
- Sizing the VPC CIDR too small — If you start with 10.0.0.0/24 (256 IPs), you’ll run short on subnets later. The primary CIDR cannot be changed after creation (you can only add secondary CIDRs). Allocate generously with a /16 from the start.
- The NAT Gateway showing up large on the bill — If the NAT GW line grows well beyond its fixed hourly charge, check the traffic volume. If you have many S3 / ECR calls, bypass the NAT with VPC Endpoints.
- Losing the key pair — If you lose the SSH key pair, you cannot download it again, and regaining access means rebuilding the instance or detaching its root volume to install a new public key. Because of this risk, the trend is increasingly to move to SSM Session Manager (Chapter 9 EC2 Operations).
- Placing RDS alone in the wrong AZ — If EC2 is in AZ a and RDS is only in AZ b, it works, but you incur cross-AZ data transfer costs. Put them in the same AZ, or make RDS Multi-AZ.
- EBS living longer than the instance — Check whether EBS is auto-deleted on instance termination (DeleteOnTermination). If that option is false, it’s a common accident that only the EBS volume of a terminated instance remains and keeps getting billed.
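The ordered checklist in the first pitfall translates directly into code: walk the checks in order and stop at the first one that fails. This is a sketch that takes the answers as a plain dict you fill in by hand; the key names are invented for illustration and nothing here queries AWS.

```python
from typing import Optional

# (fact name, message if the fact is False), in the order given in the checklist.
CHECKS = [
    ("has_public_ip", "No public IP or EIP is attached"),
    ("route_to_igw", "The subnet's route table has no 0.0.0.0/0 -> igw-... route"),
    ("sg_allows_outbound", "The security group blocks outbound traffic"),
    ("nacl_allows_traffic", "The NACL blocks outbound or return traffic"),
    ("os_firewall_open", "An OS-level firewall (ufw, iptables) is blocking traffic"),
]


def first_failure(facts: dict[str, bool]) -> Optional[str]:
    """Return the message for the first failing check, or None if all pass."""
    for key, message in CHECKS:
        if not facts.get(key, False):  # a missing answer counts as a failure
            return message
    return None


facts = {
    "has_public_ip": True,
    "route_to_igw": False,
    "sg_allows_outbound": True,
    "nacl_allows_traffic": True,
    "os_firewall_open": True,
}
print(first_failure(facts))
```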
Exercises
- Looking at the “Example of subnet partitioning inside a VPC” diagram, pick one VPC CIDR (/16) for a service you’d build, then carve out Public · Private · Database subnets (one of each per AZ across two AZs, six in total) and write down their IP ranges. In Chapter 25 Intro to Terraform, this partitioning becomes aws_subnet resources.
- Write down the five-step checklist from “Why can’t my EC2 reach the internet?” without looking. Then, based on §“Route tables” and §“Security Group vs NACL”, classify which steps are route-table problems and which are SG / NACL problems.
- Write down the three ways to reduce NAT Gateway cost, and note in one sentence each what trade-off each has in terms of availability and whether it’s free. This note will be reused in Chapter 27 Cost Optimization.
In short: EC2 is a single virtual computer where the instance type sets the spec, the AMI sets the OS, and EBS sets the disk. A VPC is a private IP network, and a subnet inside it becomes public if its route table points to an IGW and private if it points to a NAT. The IGW is free, but the NAT Gateway is billed both per hour and per GB, so bypass it with VPC endpoints.
Next chapter
In this chapter we’ve laid out the picture of launching a single EC2. Next, Chapter 9 EC2 Operations moves to the everyday tools for handling that EC2 safely. We’ll cover Security Group rule design, the limits of key pairs and migration to SSM Session Manager, and how to harden the skeleton with an AMI.