Chapter 8

EC2 and VPC Basics

EC2 and VPC are the cloud's oldest compute and network building blocks. This chapter shows how instance types, AMIs, and EBS fit together with VPCs, subnets, route tables, the IGW, and NAT into one picture, laying the first skeleton of your operational infrastructure.

If you finished the preparation phase in Part 1 — accounts and IAM, cost, CLI, security, and CloudWatch — this chapter is where you actually start putting something up. Part 2’s goal is to thread the components you meet most often on AWS — EC2, VPC, S3, RDS, Route 53, ALB, and CloudFront — into one operational picture.

The first step is EC2 and VPC. They are the cloud’s oldest, most fundamental tools. EC2 is a single virtual computer, and VPC is the network those computers live in. The two always travel together. The skeleton we lay in this chapter carries straight into the security rules and access of Chapter 9 EC2 Operations and the subnet placement of Chapter 11 RDS.

In this chapter we walk through every component that comes into play behind the scenes when you launch a single EC2 instance: instance types, AMI, EBS, VPC, subnets, route tables, IGW, and NAT Gateway.

The big picture

When you launch one EC2 instance from the console, all of the following come into play behind it.

The full stack of a single EC2 instance
              ┌──────────────────────────────────┐
              │              VPC                 │
              │  10.0.0.0/16                     │
              │                                  │
              │   ┌──────────────────────────┐   │
              │   │   Subnet (Public)        │   │
              │   │   10.0.1.0/24            │   │
              │   │                          │   │
              │   │    ┌───────────────┐     │   │
              │   │    │ EC2 instance  │     │   │
              │   │    │ AMI = OS      │     │   │
              │   │    │ EBS = disk    │     │   │
              │   │    │ ENI = network │     │   │
              │   │    └───────────────┘     │   │
              │   └──────────┬───────────────┘   │
              │              │                   │
              │              ▼                   │
              │      Route Table                 │
              │              │                   │
              │              ▼                   │
              │      Internet Gateway (IGW)      │
              └──────────────┼───────────────────┘
                          internet

Let’s walk through the roles that appear in this picture in order.

EC2 — a single virtual computer

EC2 (Elastic Compute Cloud) is AWS’s virtual machine service. Each Instance, as the console calls it, is a single virtual computer running Linux or Windows.

Instance types

For EC2 instance types, the name is the spec.

The shape of an instance type
t3.micro
│  │  └── size (nano / micro / small / medium / large / xlarge / 2xlarge / ...)
│  └───── generation (1, 2, 3, 4, 5, 6, 7 ...)
└──────── family

The commonly used families are as follows.

| Family | Nickname          | Role                                                                                        |
| ------ | ----------------- | ------------------------------------------------------------------------------------------- |
| t      | Burstable         | Workloads that normally use little but suddenly spike. The default for dev / side projects |
| m      | General purpose   | Balanced CPU / memory. General web servers                                                  |
| c      | Compute optimized | CPU-heavy work (encoding, build servers)                                                    |
| r      | Memory optimized  | Memory-heavy work (Redis, large caches)                                                     |
| i, d   | Storage optimized | Work with large local NVMe / HDD (DBs, data pipelines)                                      |
| g, p   | GPU               | Machine learning / graphics                                                                 |

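At first almost everyone starts with t3.micro / t3.small / t3.medium. The t family uses a CPU credit model: an instance earns credits while it runs below its baseline and spends them when it bursts, so it is cost-efficient when average utilization is low. Once you move into operations with sustained load, it’s common to switch to the m family.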

t3 vs t3a vs t4g. t3 is Intel, t3a is AMD (about 10% cheaper), and t4g is ARM-based Graviton (about 20% cheaper and still fast). For new workloads, evaluate t4g first, after confirming that your software runs on ARM.
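To make the naming scheme concrete, here is a minimal sketch that splits an instance type name into its parts. The regular expression and the names InstanceType, parse_instance_type, and FAMILIES are assumptions made for illustration, not an AWS API, and the pattern covers only the common naming shape shown above, not every special-purpose type.

```python
import re
from typing import NamedTuple

# Family nicknames, taken from the family table in this section.
FAMILIES = {
    "t": "Burstable",
    "m": "General purpose",
    "c": "Compute optimized",
    "r": "Memory optimized",
    "i": "Storage optimized",
    "d": "Storage optimized",
    "g": "GPU",
    "p": "GPU",
}


class InstanceType(NamedTuple):
    family: str      # e.g. "t"
    generation: int  # e.g. 3
    options: str     # extra letters after the generation, e.g. "a" (AMD) or "g" (Graviton)
    size: str        # e.g. "micro"


# Hypothetical pattern: letters, digits, optional letters, a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as 't4g.medium' into family, generation, options, size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    family, generation, options, size = match.groups()
    return InstanceType(family, int(generation), options, size)


for name in ("t3.micro", "t3a.small", "t4g.medium"):
    parsed = parse_instance_type(name)
    print(name, "->", parsed, "/", FAMILIES.get(parsed.family, "other"))
```

Reading the three t variants through the parser shows that only the options letter changes: the family and generation digits stay put while "a" or "g" marks the CPU vendor.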

AMI — the OS image

AMI (Amazon Machine Image) is the OS image used to launch an instance. It’s a snapshot of an OS like “Ubuntu 22.04” or “Amazon Linux 2023” plus pre-installed tools.

The commonly used AMI kinds are as follows.

  • Amazon Linux 2023 — an RPM-based distro (derived from Fedora, so it feels RHEL-like) built and maintained by AWS. The smoothest on EC2.
  • Ubuntu LTS — the most familiar choice. 22.04 / 24.04.
  • Debian — lighter than Ubuntu.
  • Windows Server — license cost is included.
  • A user-created AMI — created by snapshotting a running instance (Chapter 9 EC2 Operations).

AMIs are per-region. An AMI ID from the Seoul region cannot be used as-is in the Tokyo region. To use the same image elsewhere, copy it to the target region with aws ec2 copy-image; the copy gets a new AMI ID.

EBS — the disk

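EBS (Elastic Block Store) is the block storage attached to EC2, that is, a virtual disk (SSD or HDD, depending on the volume type). An EBS volume is a resource separate from the instance, so it can outlive it: whether a volume survives termination depends on its DeleteOnTermination flag, which defaults to true for the root volume. Most instances boot from an EBS root volume, so without EBS, EC2 is an empty shell.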

The volume types are as follows.

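| Type                      | Abbrev.                 | Role                               |
| ------------------------- | ----------------------- | ---------------------------------- |
| General Purpose SSD       | gp3                     | Default. Good price/performance    |
| General Purpose SSD (old) | gp2                     | Old default. Use gp3 for new work  |
| Provisioned IOPS SSD      | io2 / io2 Block Express | DBs needing high IOPS              |
| Throughput Optimized HDD  | st1                     | Large sequential reads (big data)  |
| Cold HDD                  | sc1                     | Backups accessed rarely            |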

Most workloads start with gp3. It’s about 20% cheaper per GB than gp2 and lets you tune IOPS and throughput independently of the volume size.

There’s also another kind of storage called Instance Store. It’s a disk (often NVMe) physically attached to the host, so it’s fast, but its data is lost when the instance is stopped, hibernated, or terminated. Use it only for caches or temporary data.

VPC — the virtual network

VPC (Virtual Private Cloud) is your own network created inside AWS. It’s the private IP space where resources like EC2 / RDS / Lambda live.

CIDR block

The first thing you decide when creating a VPC is the CIDR block, that is, the IP address range.

Commonly used VPC CIDRs
10.0.0.0/16    # 10.0.0.0 ~ 10.0.255.255 (65,536)
172.16.0.0/16  # 172.16.0.0 ~ 172.16.255.255
192.168.0.0/16 # private IP space

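A /16 has 65,536 IPs, and a /24 has 256. It’s common to start with a /16 and carve subnets out of it. Note that AWS reserves five addresses in every subnet (the first four and the last one), so a /24 subnet yields 251 usable IPs.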

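Use private IP space. Pick a block within RFC 1918’s private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); AWS accepts a VPC block between /16 and /28. AWS does not strictly forbid publicly routable ranges, but if you use one as the VPC CIDR, traffic to those addresses is routed inside the VPC, so the real internet hosts in that range may become unreachable.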
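The arithmetic behind these ranges can be checked with Python's standard ipaddress module. The sketch below is illustrative: describe_cidr and usable_in_subnet are hypothetical helper names, and the constant of five reserved addresses per subnet reflects AWS's documented behavior rather than anything the module knows about.

```python
import ipaddress

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# AWS reserves five addresses in each subnet (stated as an assumption here;
# the ipaddress module itself has no notion of this).
AWS_RESERVED_PER_SUBNET = 5


def describe_cidr(cidr: str) -> dict:
    """Return the first/last address, size, and RFC 1918 membership of a block."""
    net = ipaddress.ip_network(cidr)
    return {
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "total": net.num_addresses,
        "rfc1918": any(net.subnet_of(block) for block in RFC1918),
    }


def usable_in_subnet(cidr: str) -> int:
    """Addresses left for your resources after the AWS-reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


print(describe_cidr("10.0.0.0/16"))
print(describe_cidr("172.31.0.0/16"))
print(usable_in_subnet("10.0.1.0/24"))
```

The explicit subnet_of check against the three ranges is deliberate: the module's is_private property also counts loopback and link-local blocks as private, which is broader than what you want for a VPC.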

Default VPC

A new account has a Default VPC pre-created in every region. It comes with CIDR 172.31.0.0/16, one public subnet in each AZ, and an IGW already attached.

The default VPC is enough for learning or prototypes. For operations, the common practice is to move to a freshly created custom VPC. Every subnet in the default VPC is public and auto-assigns public IPs, so anything you launch there gets an internet-facing address, and only its security group stands between it and the internet. VPC design is covered in more depth in Chapter 28 VPC in Depth.

Subnets — compartments inside a VPC

The unit that subdivides a VPC is a Subnet. A subnet belongs to one AZ. So for Multi-AZ operation, you must create subnets across multiple AZs as well.

Example of subnet partitioning inside a VPC
VPC 10.0.0.0/16
├── Public Subnet A   10.0.1.0/24  (AZ a) ── ALB, NAT, Bastion
├── Public Subnet B   10.0.2.0/24  (AZ b)
├── Private Subnet A  10.0.10.0/24 (AZ a) ── EC2 (app)
├── Private Subnet B  10.0.11.0/24 (AZ b)
├── Database Subnet A 10.0.20.0/24 (AZ a) ── RDS
└── Database Subnet B 10.0.21.0/24 (AZ b)

A subnet’s kind is not a setting you pick directly; its route table determines it. There is no separate “public” attribute. A subnet whose route table has a route to an IGW is, by definition, public.
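A partitioning like the example above is easy to get wrong by hand, so it can help to check it mechanically. The following sketch uses the CIDRs from the example; validate_plan and the subnet names are hypothetical, chosen only for illustration.

```python
import ipaddress
from itertools import combinations

VPC = ipaddress.ip_network("10.0.0.0/16")

# The six subnets from the partitioning example (names are illustrative).
PLAN = {
    "public-a": "10.0.1.0/24",
    "public-b": "10.0.2.0/24",
    "private-a": "10.0.10.0/24",
    "private-b": "10.0.11.0/24",
    "database-a": "10.0.20.0/24",
    "database-b": "10.0.21.0/24",
}


def validate_plan(vpc, plan):
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    problems = []
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    for name, net in nets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


print(validate_plan(VPC, PLAN) or "plan is consistent")
```

An empty result means every subnet sits inside the VPC block and no two subnets share addresses. The same check works for the plan you write in the exercises.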

Public vs Private subnets

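|                       | Public subnet             | Private subnet      |
| --------------------- | ------------------------- | ------------------- |
| Internet in/out       | Directly possible         | Out only, via NAT   |
| Auto-assign public IP | If you turn the option on | Usually off         |
| Resources placed      | ALB, NAT GW, Bastion      | EC2 (app), Lambda   |
| Routing               | 0.0.0.0/0 to IGW          | 0.0.0.0/0 to NAT GW |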

The operational pattern is to put app servers in Private and expose only an ALB in Public to receive external traffic. Since no public IP is attached directly to the app servers, the attack surface shrinks.

Route tables — where to send it

A Route Table defines where a subnet’s traffic goes. Each subnet is associated with exactly one route table, while one route table can be shared by several subnets.

A public subnet's route table
| Destination    | Target          |
| -------------- | --------------- |
| 10.0.0.0/16    | local           |  ← within the VPC is always local
| 0.0.0.0/0      | igw-xxxxxxxx    |  ← everything else goes to the IGW (internet)

A private subnet's route table
| Destination    | Target          |
| -------------- | --------------- |
| 10.0.0.0/16    | local           |
| 0.0.0.0/0      | nat-xxxxxxxx    |  ← everything else goes to the NAT GW

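The local route is added automatically and cannot be removed, so all subnets inside the same VPC can always route to each other (whether the traffic is actually allowed still depends on SG / NACL). When several routes match a destination, the most specific one (longest prefix) wins, which is why 10.0.0.0/16 takes precedence over 0.0.0.0/0 for traffic inside the VPC.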
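The two tables above can be read as input to a small lookup: among the routes whose destination contains the packet's address, the most specific one decides the target. The sketch below models that rule and the "a route to an IGW makes it public" definition. The function names lookup and subnet_kind are hypothetical, and the targets are the placeholder IDs from the tables.

```python
import ipaddress

# (destination, target) pairs copied from the two route tables above.
PUBLIC_ROUTES = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-xxxxxxxx")]
PRIVATE_ROUTES = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "nat-xxxxxxxx")]


def lookup(routes, destination_ip):
    """Return the target of the most specific route matching destination_ip."""
    dest = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the packet is dropped
    # Longest prefix (largest prefix length) wins.
    return max(matches)[1]


def subnet_kind(routes):
    """Classify a subnet by where its default route points."""
    default_target = dict(routes).get("0.0.0.0/0", "")
    return "public" if default_target.startswith("igw-") else "private"


print(lookup(PUBLIC_ROUTES, "10.0.10.5"))   # stays inside the VPC
print(lookup(PRIVATE_ROUTES, "8.8.8.8"))    # leaves through the NAT GW
print(subnet_kind(PUBLIC_ROUTES), subnet_kind(PRIVATE_ROUTES))
```

A subnet with only the local route has no default route at all, so subnet_kind reports it as private and lookup returns None for any internet address.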

IGW — Internet Gateway

The IGW is the gateway connecting the VPC to the internet, in both directions. You can attach at most one per VPC (with none attached, it’s a fully private VPC).

The IGW itself does not route traffic. Traffic only flows once you add the IGW as a target in a route table. That’s why, even within a VPC using the same IGW, you can make only some subnets public.

The IGW is free, and its availability is managed automatically.

NAT Gateway — from private out to the internet

For an EC2 instance in a private subnet to receive OS patches or call external APIs, it needs outbound internet access. But nothing on the internet should be able to initiate a connection to that instance. NAT (Network Address Translation) fills that role.

NAT Gateway's traffic flow
[Private EC2] ──out──▶ [NAT GW (Public Subnet)] ──out──▶ [IGW] ──▶ internet
                                                ◀ no inbound

You place the NAT GW in a public subnet and give it an Elastic IP (it needs a public address and a route to the IGW to reach the internet). When the private subnet’s route table points to the NAT GW as the target, the private subnet’s traffic goes out to the internet through the NAT.

NAT GW vs NAT Instance

In the old days, you ran NAT yourself on a single EC2 (a NAT Instance). Now you almost always use the managed NAT Gateway.

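|              | NAT Gateway                                       | NAT Instance (old)  |
| ------------ | ------------------------------------------------- | ------------------- |
| Management   | AWS manages it                                    | Yourself            |
| Availability | Automatic per AZ (for Multi-AZ, place one per AZ) | A single EC2        |
| Bandwidth    | Up to 100 Gbps                                    | Depends on EC2 size |
| Cost         | Per hour + per GB                                 | EC2 cost only       |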

The NAT GW cost trap

The NAT Gateway is billed both per hour and per GB of data processed. With heavy traffic, it can become a surprisingly large cost. Ways to save are as follows.

  • Placing a single NAT GW instead of one per AZ reduces the hourly cost but lowers availability (if the AZ hosting the NAT GW fails, the Private subnets in the other AZs lose internet access too).
  • Route S3 / DynamoDB traffic around the NAT with a Gateway VPC Endpoint (free).
  • Services like ECR / Secrets Manager can also bypass the NAT with an Interface VPC Endpoint (it has a per-hour cost and its own per-GB charge, but the per-GB rate is lower than the NAT GW’s).

The kinds and design of VPC Endpoints are covered in earnest in Chapter 28 VPC in Depth.
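To see how the two billing dimensions add up, here is a rough cost sketch. The rates passed in below are made-up round numbers used only to show the arithmetic, not current AWS prices, which vary by region. The function name nat_monthly_cost and the 730-hour month are likewise assumptions.

```python
HOURS_PER_MONTH = 730  # common approximation for one month


def nat_monthly_cost(gateways, gb_processed, hourly_rate, per_gb_rate,
                     hours=HOURS_PER_MONTH):
    """Hourly charge for every gateway plus the per-GB data processing charge."""
    return gateways * hours * hourly_rate + gb_processed * per_gb_rate


# Hypothetical rates, for illustration only: look up real prices for your region.
HOURLY_RATE = 0.05
PER_GB_RATE = 0.05

one_per_az = nat_monthly_cost(2, 1000, HOURLY_RATE, PER_GB_RATE)
single_gw = nat_monthly_cost(1, 1000, HOURLY_RATE, PER_GB_RATE)
# Suppose 800 of the 1000 GB were S3 traffic now routed via a Gateway VPC Endpoint.
single_gw_with_endpoint = nat_monthly_cost(1, 200, HOURLY_RATE, PER_GB_RATE)

print(f"one NAT GW per AZ (2 AZs): {one_per_az:.2f}")
print(f"single NAT GW:             {single_gw:.2f}")
print(f"single + S3 endpoint:      {single_gw_with_endpoint:.2f}")
```

Even with placeholder rates, the shape of the result holds: dropping a gateway only removes a fixed hourly amount, while moving bulk traffic to an endpoint removes cost that grows with usage.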

Security Group vs NACL — the firewall

A VPC’s firewall operates in two layers. The detailed design is covered in Chapter 9 EC2 Operations, so here we’ll just look at the picture.

How a packet reaches an EC2
internet ──▶ NACL (subnet level) ──▶ Security Group (instance level) ──▶ EC2

|                  | Security Group         | NACL           |
| ---------------- | ---------------------- | -------------- |
| Applies at       | Instance (ENI)         | Subnet         |
| Stateful         | Responses auto-allowed | No             |
| Rules            | Allow only (no Deny)   | Allow + Deny   |
| Frequency of use | Daily                  | Rarely touched |

In day-to-day operations you work almost exclusively with SGs and leave NACLs at their defaults.
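The difference in rule semantics can be sketched in a few lines. This is a deliberately simplified model, assuming a single port per rule and ignoring protocols and statefulness. It shows only that a Security Group is a set of allow rules, while a NACL is an ordered list in which the lowest-numbered matching rule decides and anything unmatched is denied. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    cidr: str
    port: int
    allow: bool = True  # Security Group rules are always allow
    number: int = 0     # rule number, only meaningful for NACLs


def _matches(rule, src_ip, port):
    return rule.port == port and ip_address(src_ip) in ip_network(rule.cidr)


def sg_allows(rules, src_ip, port):
    """Security Group: traffic passes if any allow rule matches."""
    return any(_matches(rule, src_ip, port) for rule in rules)


def nacl_allows(rules, src_ip, port):
    """NACL: rules are checked in ascending number; the first match decides."""
    for rule in sorted(rules, key=lambda r: r.number):
        if _matches(rule, src_ip, port):
            return rule.allow
    return False  # implicit deny when nothing matches


# Example rules (addresses are documentation ranges, chosen for illustration).
NACL = [
    Rule("198.51.100.0/24", 443, allow=False, number=90),
    Rule("0.0.0.0/0", 443, allow=True, number=100),
]
SG = [Rule("0.0.0.0/0", 443)]


def reaches_instance(src_ip, port):
    """A packet must pass the subnet's NACL first, then the instance's SG."""
    return nacl_allows(NACL, src_ip, port) and sg_allows(SG, src_ip, port)


print(reaches_instance("203.0.113.7", 443))
print(reaches_instance("198.51.100.9", 443))
print(reaches_instance("203.0.113.7", 22))
```

Blocking one address range while allowing everyone else is only expressible in the NACL, because a Security Group has no deny rule to put ahead of the allow.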

Launching a single EC2 — CLI

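Instead of clicking through the console, you can launch an instance with a single CLI command. Substitute the AMI, key pair, security group, and subnet IDs with ones from your own account and region.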

Launching an EC2
aws ec2 run-instances \
  --image-id ami-0f3a440bbcff3d043 \
  --instance-type t3.micro \
  --key-name my-key \
  --security-group-ids sg-0abc1234def567890 \
  --subnet-id subnet-0abc1234def567890 \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=my-server}]'

Every choice you made in the console maps to a flag, one by one. Once you’ve seen the CLI, it becomes clear what the console was asking you to choose. For CLI / SDK setup, see Chapter 4 AWS CLI and SDK.
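The one-to-one mapping between choices and flags can be made explicit by generating the command from a small record. The sketch below only builds and prints the argument list and never calls AWS. LaunchSpec and to_cli_args are hypothetical names, and the flags are exactly the ones used in the command above.

```python
import shlex
from dataclasses import dataclass


@dataclass
class LaunchSpec:
    """One field per choice the console asks for (illustrative, not an AWS type)."""
    image_id: str
    instance_type: str
    key_name: str
    security_group_id: str
    subnet_id: str
    name: str


def to_cli_args(spec):
    """Translate a LaunchSpec into the argument list of aws ec2 run-instances."""
    tags = f"ResourceType=instance,Tags=[{{Key=Name,Value={spec.name}}}]"
    return [
        "aws", "ec2", "run-instances",
        "--image-id", spec.image_id,
        "--instance-type", spec.instance_type,
        "--key-name", spec.key_name,
        "--security-group-ids", spec.security_group_id,
        "--subnet-id", spec.subnet_id,
        "--tag-specifications", tags,
    ]


spec = LaunchSpec(
    image_id="ami-0f3a440bbcff3d043",
    instance_type="t3.micro",
    key_name="my-key",
    security_group_id="sg-0abc1234def567890",
    subnet_id="subnet-0abc1234def567890",
    name="my-server",
)

# Print a shell-quoted command line for review; nothing is executed here.
print(shlex.join(to_cli_args(spec)))
```

Keeping the launch parameters in one record like this makes it easy to review or diff them before anything is actually created.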

Common pitfalls

  • “Why can’t my EC2 reach the internet?” — Check the following in order. (1) Is a public IP attached (the subnet’s auto-assign public IP, or an EIP)? (2) Is the subnet public (does the route table have 0.0.0.0/0 → igw-...)? (3) Does the SG allow outbound? (4) Does the NACL allow outbound, and the return traffic inbound? (5) Is an OS-level firewall (ufw, iptables) blocking it? It’s usually (1) or (2), and if it’s a private subnet, check the NAT GW path instead.
  • Sizing the VPC CIDR too small — If you start with 10.0.0.0/24 (256 IPs), you’ll run out of room for subnets later. A VPC’s primary CIDR cannot be changed after creation (adding a secondary CIDR is possible). Allocate generously with a /16 from the start.
  • The NAT Gateway showing up large on the bill — Each NAT GW costs tens of dollars a month in hourly charges alone, before any traffic. If the bill is well above that, check the volume of data processed. If you have many S3 / ECR calls, bypass the NAT with VPC Endpoints.
  • Losing the key pair — The private key cannot be downloaded again. If you lose it, recovery means detaching the root volume and editing its authorized keys, or, more often in practice, terminating and recreating the instance. Because of this risk, the trend is increasingly to move to SSM Session Manager (Chapter 9 EC2 Operations).
  • Placing RDS alone in the wrong AZ — If EC2 is in AZ a and RDS is only in AZ b, it works, but you incur cross-AZ data transfer costs. Put them in the same AZ, or make RDS Multi-AZ.
  • EBS living longer than the instance — Check whether EBS is auto-deleted on instance termination (DeleteOnTermination). If this option is false, it’s a common accident that only the EBS volume of a terminated instance remains and keeps getting billed for a month.

Exercises

  1. Looking at the “Example of subnet partitioning inside a VPC” diagram, pick one VPC CIDR (/16) for a service you’d build, then carve out Public · Private · Database subnets, one of each per AZ across two AZs (six total), and write down their IP ranges. In Chapter 25 Intro to Terraform, this partitioning becomes aws_subnet resources.
  2. Write down the five-step checklist from “Why can’t my EC2 reach the internet?” without looking. Then, based on §“Route tables” and §“Security Group vs NACL”, classify which steps are route-table problems and which are SG / NACL problems.
  3. Write down the three ways to reduce NAT Gateway cost, and note in one sentence each what trade-off each has in terms of availability and whether it’s free. This note will be reused in Chapter 27 Cost Optimization.

In short: EC2 is a single virtual computer where the instance type sets the spec, the AMI sets the OS, and EBS sets the disk. A VPC is a private IP network, and a subnet inside it becomes public if its route table points to an IGW and private if it points to a NAT. The IGW is free, but the NAT Gateway is billed both per hour and per GB, so bypass it with VPC endpoints.

Next chapter

In this chapter we’ve laid out the picture of launching a single EC2. Next, Chapter 9 EC2 Operations moves to the everyday tools for handling that EC2 safely. We’ll cover Security Group rule design, the limits of key pairs and migration to SSM Session Manager, and how to harden the skeleton with an AMI.
