AWS Intermediate #3: S3 — static hosting and presigned URLs
If EC2 (#1 ~ #2) is the compute layer, S3 (Simple Storage Service) is AWS’s object storage layer. Launched in 2006 as one of AWS’s earliest services, it is still one of the most used.
S3 is essentially “an infinitely large global file system (where directories are fake)”. Eleven 9s (99.999999999%) of durability, ~$0.023 per GB per month, and the data hub of nearly every other AWS service: these three are S3’s identity.
This post threads S3’s shape → policies and security → static hosting → presigned URLs → storage classes.
Buckets and objects #
S3 has only two things:
- Bucket — container that holds objects. Per account / per region
- Object — the actual file. Identified by key
my-bucket/                            ← bucket (globally unique name)
  images/
    profile/2026/avatar-001.jpg       ← object (key = full path)
    profile/2026/avatar-002.jpg
  videos/
    intro.mp4
  index.html

There are no directories, really. The / in the picture is part of the key. images/profile/2026/avatar-001.jpg is the full key of one object. The console just renders it like folders by splitting on /.
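To make the "fake directories" idea concrete, here is a minimal sketch in plain Python that mimics a delimiter-based listing over a flat set of keys. The helper name `list_level` is hypothetical and this is not the S3 API itself, only an illustration of how folder-like views can be derived from keys.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (folders, objects) that sit directly under `prefix`.

    `keys` is a flat list of object keys; nothing here is a real directory.
    """
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so show it as a "folder"
            folders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(folders), sorted(objects)


keys = [
    "images/profile/2026/avatar-001.jpg",
    "images/profile/2026/avatar-002.jpg",
    "videos/intro.mp4",
    "index.html",
]

print(list_level(keys))                      # top level of the bucket
print(list_level(keys, prefix="images/"))    # one "folder" down
```

The top-level call yields two folder-like prefixes (images/ and videos/) and one object (index.html), even though the bucket only stores four flat keys.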
Global uniqueness of bucket names #
The bucket name has to be unique across every AWS account in the world. Plain names like my-bucket are long taken.
my-company-dev-uploads-2026
acme-prod-static-ap-northeast-2

Rules:
- 3–63 chars: lowercase letters, digits, hyphens (-), and dots (.)
- Dots (.) are allowed but break with SSL wildcard certificates → usually hyphens only
- No IP-address-like name, no xn-- prefix (Punycode)
- No uppercase, no underscore
Encoding environment / purpose / region / company in the name makes the bill / search easier later.
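The naming rules above can be checked before you ever call AWS. The following is a rough sketch, not an official validator: `bucket_name_problems` is a hypothetical helper, and it also assumes two rules not listed above (names start and end with a letter or digit, and no consecutive dots).

```python
import re

# 3-63 chars, lowercase letters / digits / '-' / '.', alphanumeric at both ends
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
_IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def bucket_name_problems(name):
    """Return a list of human-readable problems; an empty list means the name looks fine."""
    problems = []
    if not _NAME_RE.match(name):
        problems.append("use 3-63 chars of a-z, 0-9, '-' or '.', alphanumeric at both ends")
    if _IP_RE.match(name):
        problems.append("must not look like an IP address")
    if name.startswith("xn--"):
        problems.append("must not start with 'xn--'")
    if ".." in name:
        problems.append("must not contain consecutive dots")
    if "." in name:
        problems.append("warning: dots break wildcard TLS certificates, prefer '-'")
    return problems


for candidate in ["my-company-dev-uploads-2026", "My_Bucket", "192.168.0.1"]:
    print(candidate, "->", bucket_name_problems(candidate) or "ok")
```

Passing this check says nothing about availability: the name can still be taken by another account.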
Buckets are regional #
Bucket names are global, but the data lives in one region. Create in ap-northeast-2 (Seoul) and the data sits in Seoul. The console shows you which region.
Cross-region replication is configured explicitly via S3 Replication (CRR — Cross Region Replication).
Core attributes of an object #
Each object carries:
| Attribute | Description |
|---|---|
| Key | The object’s full path. Unique within the bucket |
| Body | The actual data (up to 5TB) |
| Content-Type | How the browser interprets it (image/jpeg, application/json) |
| Metadata | User-defined headers (x-amz-meta-*) |
| ACL | Per-object permission (rarely used today, replaced by bucket policy) |
| Storage Class | Storage class (Standard, IA, Glacier, etc.) |
| Version ID | Version identifier if versioning is on |
| ETag | Content hash (mostly MD5) |
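Of these attributes, Content-Type is the one that most often goes wrong in practice, because a wrong value makes browsers download a page instead of rendering it. A minimal sketch of deriving it from the key with the standard library follows; `content_type_for` is a hypothetical helper, and appending a UTF-8 charset to text types is an assumption that fits typical static sites.

```python
import mimetypes


def content_type_for(key, default="application/octet-stream"):
    """Guess a Content-Type from the key's file extension."""
    guessed, _encoding = mimetypes.guess_type(key)
    if guessed is None:
        return default
    if guessed.startswith("text/"):
        # Assumption: text assets are stored as UTF-8
        return f"{guessed}; charset=utf-8"
    return guessed


for key in ["index.html", "images/avatar.jpg", "data.bin-unknown"]:
    print(key, "->", content_type_for(key))

# With boto3 the value would be passed at upload time, for example:
# s3.upload_file("index.html", "my-bucket", "index.html",
#                ExtraArgs={"ContentType": content_type_for("index.html")})
```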
Upload via console / CLI / SDK #
# Single file
aws s3 cp ./image.jpg s3://my-bucket/images/profile/avatar.jpg
# Sync a whole folder
aws s3 sync ./public s3://my-bucket --delete
# Specify Content-Type
aws s3 cp ./index.html s3://my-bucket/ --content-type "text/html; charset=utf-8"
# Download
aws s3 cp s3://my-bucket/data.json ./

import boto3
s3 = boto3.client("s3")
s3.upload_file("image.jpg", "my-bucket", "images/avatar.jpg")
s3.download_file("my-bucket", "data.json", "data.json")

The five layers of security #
S3 security stacks five layers. Roughly in order of precedence (top is stronger), and an explicit Deny at any layer always wins:
1. Public Access Block ← strongest. Block decisions trump everything
2. SCP (Organizations) ← account-level guard
3. IAM Policy ← per user / role
4. Bucket Policy ← per bucket
5. Object ACL ← per object (legacy approach, rarely used)

Public Access Block — first line #
Public Access Block (PAB) is the safety net to keep buckets from accidentally going public. Four options:
| Option | Meaning |
|---|---|
| BlockPublicAcls | New ACLs can’t be public |
| IgnorePublicAcls | Existing public ACLs are ignored |
| BlockPublicPolicy | New bucket policies can’t be public |
| RestrictPublicBuckets | Even already-public buckets only allow IAM Principals |
Since April 2023, every new bucket is created with all four turned on by default. Only buckets that intend to be public (e.g., static hosting) explicitly relax these. The same four flags can also be enforced for the whole account:
aws s3control put-public-access-block \
--account-id 123456789012 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

Bucket Policy — JSON policy #
A bucket policy is a JSON policy attached directly to a bucket. It says who (Principal) can do what (Action) where (Resource).
Example 1, letting the ALB log delivery service write access logs:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "logdelivery.elasticloadbalancing.amazonaws.com"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-alb-logs/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-acl": "bucket-owner-full-control"
        }
      }
    }
  ]
}

Example 2, letting one IAM role read and list the bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/MyAppRole"
      },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}

IAM Policy #
The policy attached to IAM users / roles. It is evaluated together with the bucket policy: within one account an Allow in either is enough, while cross-account access needs an Allow in both.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::my-bucket/uploads/*"
  }]
}

For IAM details, see Basics #2.
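How the IAM policy and the bucket policy combine can be sketched as a toy decision function. This is a deliberately simplified model, not AWS's full evaluation logic: it ignores SCPs, permission boundaries, session policies, and ACLs, and the function name `s3_access_allowed` is hypothetical.

```python
def s3_access_allowed(identity_allows, bucket_allows, same_account, explicit_deny=False):
    """Simplified model of combining an IAM policy with a bucket policy.

    identity_allows: the caller's IAM policy has a matching Allow
    bucket_allows:   the bucket policy has a matching Allow for the caller
    same_account:    caller and bucket live in the same AWS account
    explicit_deny:   any policy contains a matching Deny
    """
    if explicit_deny:
        return False                      # an explicit Deny always wins
    if same_account:
        return identity_allows or bucket_allows
    return identity_allows and bucket_allows  # cross-account needs both sides


# Same account: the IAM policy alone is enough
print(s3_access_allowed(True, False, same_account=True))    # True
# Cross-account: the bucket owner must also allow the caller
print(s3_access_allowed(True, False, same_account=False))   # False
```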
Static site hosting #
S3 can host static HTML / CSS / JS as is. The simplest way to host a static site.
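aws s3 website s3://my-static-site/ \
  --index-document index.html \
  --error-document 404.html

After this, the bucket responds at the region's website endpoint (Seoul uses the dotted form s3-website.<region>; some older regions use s3-website-<region>):
http://my-static-site.s3-website.ap-northeast-2.amazonaws.com

Allowing public access #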
PAB blocks it by default. Static hosting is intentionally public, so:
- Disable the two policy-related items (BlockPublicPolicy and RestrictPublicBuckets) in the bucket’s PAB
- Allow GetObject for everyone via a bucket policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-static-site/*"
  }]
}

Limits of S3 static hosting #
S3 alone can’t do:
- HTTPS (the S3 website endpoint is HTTP)
- Custom domain + SSL certificate directly
- Edge cache (fast worldwide responses)
That’s why production almost always uses the S3 + CloudFront pattern — covered in #7 CloudFront. At that point you turn PAB back on and let only CloudFront in via OAC.
Presigned URL — temporary permission #
A presigned URL lets you say “anyone can download / upload this object for the next N minutes.” It is a pattern for temporarily delegating permission to users who have none.
The most common use cases:
- User profile image upload — the client PUTs to S3 directly
- Receipt download — a 5-minute link
- Private video streaming — 1-hour token
import boto3

s3 = boto3.client("s3")
filename = "photo.jpg"  # in practice, a sanitized name chosen by your server

url = s3.generate_presigned_url(
    "put_object",
    Params={
        "Bucket": "my-bucket",
        "Key": f"uploads/user-123/{filename}",
        "ContentType": "image/jpeg",
    },
    ExpiresIn=600,  # 10 minutes
)
# The client PUTs to this URL and must send the same Content-Type that was signed

curl -X PUT -H "Content-Type: image/jpeg" --upload-file ./photo.jpg "<presigned-url>"

Security of presigned URLs #
- The URL itself carries a signature made with the signer’s credentials. Anyone with the URL can use it until it expires
- It stops working automatically once the expiry time passes
- Use HTTPS only. HTTP exposes it
- You can pin conditions like ContentType / Content-Length
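Because the expiry is encoded in the URL itself, it can be inspected before a link is handed out. The sketch below assumes a SigV4-style presigned URL carrying the X-Amz-Date and X-Amz-Expires query parameters; the sample URL and the helper name `presigned_expiry` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url):
    """Return the UTC time at which a SigV4 presigned URL stops working."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at + lifetime


# Hypothetical URL, truncated to the parameters that matter here
sample = (
    "https://my-bucket.s3.ap-northeast-2.amazonaws.com/uploads/user-123/photo.jpg"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20260101T000000Z"
    "&X-Amz-Expires=600"
    "&X-Amz-Signature=abc123"
)
print(presigned_expiry(sample).isoformat())
```

A check like this can back a simple guard, for example refusing to log or email any link whose lifetime exceeds a team-defined limit.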
POST form vs PUT URL #
Two upload modes:
- PUT URL — simple. Metadata via headers, one fixed ContentType
- POST form (presigned post) — complex. Multiple conditions (content-length-range, starts-with, …) for stronger safety
Big / important uploads should use POST form. Simple cases use PUT URL.
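A presigned POST might look roughly like the sketch below, which uses boto3's `generate_presigned_post`. The helper names, the 5 MB cap, and the 10-minute expiry are assumptions for illustration; boto3 is imported lazily so the condition builder can be read and tried without AWS credentials.

```python
def build_post_conditions(max_bytes, content_type_prefix):
    """Policy conditions S3 enforces on the uploaded form."""
    return [
        ["content-length-range", 1, max_bytes],               # reject empty or oversized bodies
        ["starts-with", "$Content-Type", content_type_prefix],  # e.g. only image/* uploads
    ]


def create_upload_form(bucket, key, content_type,
                       max_bytes=5 * 1024 * 1024, expires_in=600):
    """Return {'url': ..., 'fields': ...} for a browser form or multipart POST."""
    import boto3  # lazy import: only needed when actually signing

    s3 = boto3.client("s3")
    prefix = content_type.split("/")[0] + "/"
    return s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields={"Content-Type": content_type},
        Conditions=build_post_conditions(max_bytes, prefix),
        ExpiresIn=expires_in,
    )


print(build_post_conditions(5 * 1024 * 1024, "image/"))
```

The client then sends a multipart/form-data POST to the returned url, including every entry of fields plus the file itself as the last form field.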
Versioning and lifecycle #
Versioning — object history #
Turning on versioning preserves earlier versions automatically when you PUT the same key again.
aws s3api put-bucket-versioning \
--bucket my-bucket \
--versioning-configuration Status=Enabled

After enabling:
- Even Delete doesn’t really delete — only adds a Delete Marker
- Recover earlier versions with --version-id
- Storage cost is the sum across all versions ← trap
Lifecycle — auto cleanup / transition #
Rules to automatically move old objects to cheaper classes or delete them.
{
  "Rules": [{
    "ID": "ArchiveOldLogs",
    "Status": "Enabled",
    "Filter": { "Prefix": "logs/" },
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER" }
    ],
    "Expiration": { "Days": 365 }
  }]
}

In production, lifecycle is essentially required; without it the bill becomes scary in 6 months.
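To see what a rule like ArchiveOldLogs does over an object's life, here is a small simulation in plain Python. It is only a model under stated assumptions: `class_at_age` is a hypothetical helper, and real lifecycle actions run asynchronously, so actual transitions can lag the configured day.

```python
def class_at_age(age_days, transitions, expiration_days, initial="STANDARD"):
    """Return the storage class an object would be in at `age_days`, or None once expired.

    transitions: list of (days, storage_class) pairs, as in the rule's "Transitions"
    """
    if age_days >= expiration_days:
        return None
    current = initial
    for days, storage_class in sorted(transitions):
        if age_days >= days:
            current = storage_class
    return current


archive_old_logs = [(30, "STANDARD_IA"), (90, "GLACIER")]

for age in (10, 45, 120, 400):
    print(age, "days ->", class_at_age(age, archive_old_logs, expiration_days=365))
```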
Storage classes — the cost lever #
The same data can live in different classes based on how often / how fast you read it, and the savings are big.
| Class | $/GB/mo | Access frequency | Retrieval | Use |
|---|---|---|---|---|
| Standard | $0.023 | Daily | Instant | Default. Hot data |
| Standard-IA | $0.0125 | Sometimes | Instant | Backups, analysis |
| One Zone-IA | $0.01 | Sometimes, recreatable | Instant | One AZ — low criticality |
| Intelligent-Tiering | Auto | Pattern unknown | Instant | When access frequency is uneven |
| Glacier Instant Retrieval | $0.004 | Quarterly | Instant | Archive + need instant sometimes |
| Glacier Flexible Retrieval | $0.0036 | 1–2x per year | Min–hours | General archive |
| Glacier Deep Archive | $0.00099 | Almost never | 12 hours | Long-term compliance |
Numbers are approximate and match the us-east-1 list prices; ap-northeast-2 runs slightly higher. See the official pricing page for details.
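To get a feel for the spread, the per-GB prices from the table can be plugged into a tiny calculator. This is a back-of-the-envelope sketch: it covers storage only and ignores request, retrieval, and minimum-duration charges, and the dictionary keys are just labels chosen here.

```python
# Approximate $/GB/month, copied from the table above
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_IR": 0.004,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(size_gb, storage_class):
    """Storage-only cost in USD per month, rounded to cents."""
    return round(size_gb * PRICE_PER_GB_MONTH[storage_class], 2)


size_gb = 10_000  # roughly 10 TB of logs
for storage_class in PRICE_PER_GB_MONTH:
    print(f"{storage_class:13s} ${monthly_storage_cost(size_gb, storage_class):>8.2f}")
```

For 10 TB the gap between Standard and Deep Archive is more than an order of magnitude, which is why the access pattern matters more than the raw size.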
Class decision guide #
Daily / weekly access?
├── YES → Standard
└── NO →
    Sometimes (≤1x/month)?
    ├── YES → Standard-IA (One Zone-IA if recreatable)
    └── NO →
        Pattern predictable?
        ├── YES → Glacier family
        └── NO → Intelligent-Tiering

Trap — class transition cost #
Each lifecycle transition, such as Standard → IA, is billed per request at roughly $0.01 per 1,000 objects (more for the Glacier classes). With 100M objects, that is about $1,000 per hop, so don’t bounce around with lifecycle rules. Decide based on object size / access frequency.
S3 consistency #
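In the old days, overwrites and deletes were only eventually consistent. Since December 2020, every region has strong read-after-write consistency:
- GET right after PUT returns the new data
- LIST reflects PUT and DELETE immediately
However, bucket-level configuration changes (for example, enabling versioning for the first time) are still eventually consistent and can take a few minutes to propagate.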
S3 with other services #
| Common companions | Pattern |
|---|---|
| CloudFront | S3 + edge cache + custom domain (#7) |
| Lambda | S3 PUT trigger for image processing / indexing (Advanced #3) |
| Athena | SQL on CSV / Parquet / JSON in S3 |
| Glue | S3 data catalog / ETL |
| CloudTrail / VPC Flow Logs / ALB Logs | All stored in S3 |
Common pitfalls #
1) Bucket inadvertently public #
Misconfigured public S3 buckets are a recurring cause of the data leaks that make the news. New buckets start with all 4 PAB flags on; only intentionally public buckets, like static hosting, explicitly relax them.
2) Cost bomb #
- Per-GB storage + request count + data transfer triple-charged
- Egress to the internet is ~$0.09/GB — for a popular static site that dominates
- Pair with CloudFront to cut egress + accelerate via edge cache (#7)
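The three charges can be compared with a rough estimate like the one below. The default rates are assumptions: ~$0.023/GB storage and ~$0.09/GB egress come from this post, while ~$0.0004 per 1,000 GET requests is an approximate Standard-class list price; check the pricing page for your region.

```python
def monthly_bill(storage_gb, get_requests, egress_gb,
                 storage_price=0.023, get_price_per_1000=0.0004, egress_price=0.09):
    """Break a month's S3 bill into its three parts (USD, rounded to cents)."""
    return {
        "storage": round(storage_gb * storage_price, 2),
        "requests": round(get_requests / 1000 * get_price_per_1000, 2),
        "egress": round(egress_gb * egress_price, 2),
    }


# A 50 GB static site serving 10M requests and 2 TB of traffic per month
bill = monthly_bill(storage_gb=50, get_requests=10_000_000, egress_gb=2_000)
print(bill)
print("total:", round(sum(bill.values()), 2))
```

In this made-up scenario storage is about a dollar while egress is well over a hundred, which is the pattern that makes CloudFront attractive.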
3) Millions of small files #
Each object incurs GET / PUT cost; millions of tiny files is surprisingly expensive. Bundle (tar.gz, Parquet) or move to DynamoDB.
4) No lifecycle for a year #
Logs / temp files staying on Standard — bill spikes after 6 months. Set up lifecycle on day one.
5) Versioning on, forgotten #
Versioning + no lifecycle = storage cost grows forever. If versioning is on, set lifecycle to clean up non-current versions.
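A cleanup rule for non-current versions could be built roughly as follows. The rule ID, the 30-day and 7-day values, and the helper name are assumptions; the keys follow the lifecycle configuration shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
import json


def noncurrent_cleanup_rule(prefix="", noncurrent_days=30, abort_multipart_days=7):
    """Lifecycle rule that trims old versions on a versioned bucket."""
    return {
        "ID": "CleanUpOldVersions",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Delete versions N days after they stop being the current one
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
        # Remove delete markers that no longer hide any version
        "Expiration": {"ExpiredObjectDeleteMarker": True},
        # Also drop multipart uploads that never completed
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_multipart_days},
    }


rule = noncurrent_cleanup_rule()
print(json.dumps({"Rules": [rule]}, indent=2))

# Applying it with boto3 (this call replaces the bucket's whole lifecycle configuration):
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration={"Rules": [rule]})
```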
6) Presigned URL expiry too long #
A 24-hour presigned URL is practically permanent. Usually 5–15 minutes is enough, an hour at most.
7) s3:* wildcard in IAM #
Action: "s3:*" is dangerous. List explicit actions like GetObject / PutObject / ListBucket.
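A quick lint for this pitfall can be written in a few lines. The sketch below scans a policy document (as a Python dict) for Allow statements with wildcard S3 actions; `wildcard_actions` is a hypothetical helper and only catches the two broadest patterns, not partial wildcards such as s3:Get*.

```python
def wildcard_actions(policy):
    """Return the wildcard actions found in Allow statements of a policy dict."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):      # a single statement may be given without a list
        statements = [statements]
    found = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        found.extend(a for a in actions if a == "*" or a.lower() == "s3:*")
    return found


risky = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::my-bucket/uploads/*",
    }],
}
print(wildcard_actions(risky))   # flags the wildcard
print(wildcard_actions(scoped))  # nothing to flag
```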
Wrap-up #
What we took home this time:
- S3 = infinite object storage. Bucket (globally unique name) + Object (key) are the only two things
- Directories are fake — just part of the key
- Permission evaluation: PAB → SCP → IAM Policy → Bucket Policy → ACL
- New buckets start with all 4 PAB flags on
- Static hosting = bucket + website endpoint + public read policy. HTTPS / edge come from #7 CloudFront
- Presigned URL = temporary delegation. 5–15 min, HTTPS only, pin ContentType
- Versioning + lifecycle are a pair. Versioning without lifecycle = bill grows
- Storage classes — Standard / IA / One Zone-IA / Intelligent-Tiering / 3 Glacier flavors
- Pitfalls — public leaks, egress cost, small files, missing lifecycle, versioning cost, expiry too long, wildcard IAM
Next — RDS #
The object piece is set. Now to relational databases.
In #4 RDS — managed DB, backups, parameter groups we’ll line up the managed model, automated backups and PITR, Multi-AZ, parameter / option groups, and how to handle minor vs major upgrades.