Security Basics — MFA, Key Rotation, Least Privilege
Enforcing MFA on root and IAM users, automating access-key rotation, checking permissions with IAM Access Analyzer, least-privilege patterns, and common incident cases — the security guardrails that hold up in operations.
We saw the permission model in Chapter 2 IAM and SSO login in Chapter 5 CloudShell and SSO. This chapter sorts out, on top of those, the security guardrails that hold up in operations. It is the last axis of the first setup you must complete right after sign-up.
About 90% of AWS security incidents come down to one of four causes: a root or user password phished and stolen because there was no MFA, an access key exposed in git / Slack / logs, permissions that are too broad and inflate breach damage, or an incident that goes undiscovered because CloudTrail / GuardDuty was off.
Put guardrails around these four and the incident rate drops sharply. This chapter builds those guardrails in order.
MFA — the single most important thing #
Authentication that stops at a single factor, a password, does not meet the 2026 operational bar. One phish and the password is gone. MFA (Multi-Factor Authentication) additionally requires a second factor, usually a 6-digit code from a phone app.
MFA kinds #
| Kind | What | Recommendation |
|---|---|---|
| Virtual MFA (TOTP) | A phone app (Google Authenticator, 1Password, Authy) | Standard — almost every case |
| Hardware MFA | A USB key like a YubiKey | Root / high-privilege — strongest |
| U2F / WebAuthn | Browser + a hardware key | Elevating operational credentials |
| SMS | Text message | Forbidden (SIM-swap attacks) |
Hardware MFA is ideal for root, and virtual MFA is plenty for regular users.
Enabling root user MFA #
Do this immediately as your first task right after sign-up.
Console (root login) → top-right user menu → Security credentials
→ Multi-factor authentication (MFA) → Assign MFA device
→ choose Virtual MFA / Hardware MFA
→ scan the QR code with the phone app
→ enter two consecutive codes (the app gives a new code every 30 seconds)
After this, every root login requires a password + a 6-digit code.
Enforcing MFA on IAM users #
Turning it on for root alone isn’t enough. You have to enforce it on all IAM users. There are two ways.
Method 1: with a policy — “deny almost every action unless the session has MFA on.”
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSelfManageCredentials",
      "Effect": "Allow",
      "Action": [
        "iam:ChangePassword",
        "iam:CreateVirtualMFADevice",
        "iam:EnableMFADevice",
        "iam:GetUser",
        "iam:ListMFADevices",
        "iam:ResyncMFADevice"
      ],
      "Resource": [
        "arn:aws:iam::*:user/${aws:username}",
        "arn:aws:iam::*:mfa/${aws:username}"
      ]
    },
    {
      "Sid": "DenyAllExceptListedIfNoMFA",
      "Effect": "Deny",
      "NotAction": [
        "iam:CreateVirtualMFADevice",
        "iam:EnableMFADevice",
        "iam:GetUser",
        "iam:ListMFADevices",
        "iam:ResyncMFADevice",
        "iam:ChangePassword",
        "sts:GetSessionToken"
      ],
      "Resource": "*",
      "Condition": {
        "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
      }
    }
  ]
}
Attach this policy to the group all users belong to and they can do essentially nothing without MFA. The only exceptions are the self-service actions listed in NotAction, such as registering MFA, changing the password, and requesting a session token.
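Why the policy uses BoolIfExists rather than Bool: a request signed with a long-term access key does not carry the aws:MultiFactorAuthPresent key at all, and a plain Bool condition never matches a missing key. The following is a minimal Python sketch of that matching rule only. The function names are hypothetical, and it models this single condition key, not IAM's full evaluation logic.

```python
from typing import Optional


def bool_condition_matches(context_value: Optional[str], expected: str) -> bool:
    """Plain Bool: a missing context key never matches."""
    if context_value is None:
        return False
    return context_value.lower() == expected.lower()


def bool_if_exists_matches(context_value: Optional[str], expected: str) -> bool:
    """BoolIfExists: a missing context key counts as a match."""
    if context_value is None:
        return True
    return context_value.lower() == expected.lower()


def deny_applies(mfa_present: Optional[str], operator: str = "BoolIfExists") -> bool:
    """Would the Deny statement's condition match for this request?

    mfa_present is the request's aws:MultiFactorAuthPresent value:
    "true", "false", or None when the key is absent (long-term access keys).
    """
    if operator == "BoolIfExists":
        matcher = bool_if_exists_matches
    else:
        matcher = bool_condition_matches
    return matcher(mfa_present, "false")
```

With BoolIfExists, deny_applies(None) is True, so a bare access key is blocked too. With Bool, the same request would slip past the Deny.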
The flow for forcing MFA registration on first login #
When a new IAM user first logs in, the policy leaves them able to do nothing but register MFA. Once they register their own MFA, they use it normally from then on. This is the standard first-login flow.
Access-key rotation — 90 days is the standard #
The access keys we saw in Chapter 4 CLI and SDK accumulate exposure risk over time.
Rotation policy #
| Item | Recommended rotation cycle |
|---|---|
| User access key | 90 days |
| CI / CD key | 60 days (or switch to OIDC) |
| Service-account key | 30 ~ 60 days |
| Temporary credentials (SSO / Role) | No rotation needed (auto short-lived issuance) |
Rotation procedure #
The procedure relies on the fact that an IAM user can hold up to two access keys at once.
# 1) Issue a new key (now 2 active keys)
aws iam create-access-key --user-name curtis
# 2) Replace with the new key in every environment (CI env vars, ~/.aws/credentials, etc.)
# 3) Monitor for a few days — any use of the old key?
# Check with CloudTrail or the IAM credential report
# 4) Deactivate the old key (don't delete yet — for rollback)
aws iam update-access-key --user-name curtis --access-key-id AKIA-OLD --status Inactive
# 5) Monitor a week more → if really unused, delete
aws iam delete-access-key --user-name curtis --access-key-id AKIA-OLD
IAM Credential Report — checking rotation #
See every user’s key / MFA / activity status in one CSV.
aws iam generate-credential-report
aws iam get-credential-report --query Content --output text | base64 -d > report.csv
user
mfa_active
access_key_1_active
access_key_1_last_rotated
access_key_1_last_used_date
access_key_2_active
access_key_2_last_rotated
password_last_used
Keys over 90 days old, users without MFA on, and users unused for a month show at a glance.
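The check itself is easy to automate once report.csv exists. Below is a minimal sketch in Python (standard library only) that reads the report text and lists active keys older than 90 days and users without MFA. The function names and the 90-day constant are assumptions for illustration, and it flags every row with mfa_active set to false, including users who have no console password.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)


def parse_timestamp(value):
    """Return an aware datetime, or None for placeholders such as 'N/A'."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def find_issues(report_csv, now=None):
    """Return (user, message) pairs for stale keys and missing MFA."""
    now = now or datetime.now(timezone.utc)
    issues = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        user = row["user"]
        if row.get("mfa_active") == "false":
            issues.append((user, "MFA is not active"))
        for n in ("1", "2"):
            if row.get(f"access_key_{n}_active") != "true":
                continue
            rotated = parse_timestamp(row.get(f"access_key_{n}_last_rotated"))
            if rotated is not None and now - rotated > MAX_KEY_AGE:
                age_days = (now - rotated).days
                issues.append((user, f"access_key_{n} is {age_days} days old"))
    return issues
```

Usage would look like find_issues(open("report.csv").read()). Sending the result to a chat channel on a schedule is left out of the sketch.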
When you find an exposed key #
If you pushed a key in code, the time until a bot finds it is usually measured in minutes.
What to do immediately (in time order) #
# 1) Deactivate the exposed key
aws iam update-access-key --user-name <user> --access-key-id <key-id> --status Inactive
# 2) Issue a new key
aws iam create-access-key --user-name <user>
# 3) Check for traces of use
# Console → CloudTrail → Event history
# search by AccessKeyId → any unintended use
# 4) Delete the exposed key
aws iam delete-access-key --user-name <user> --access-key-id <key-id>
# 5) Clean the repository
# Remove the key from history with BFG Repo-Cleaner or git filter-repo
# (if already pushed, cleaning history alone isn't safe; deleting the key matters more)
Tools AWS helps with automatically #
AWS scans public repositories like GitHub, and when it finds an exposed key, it emails the account owner and, in some cases, automatically attaches a quarantine policy (one of the AWSCompromisedKeyQuarantine managed policies) that restricts what the key can do. But this is a secondary safety net. Finding it yourself is faster.
IAM Access Analyzer — finding too-broad permissions #
IAM Access Analyzer analyzes your account’s IAM policies and resource policies (S3 bucket policies, KMS key policies, etc.) to find resources that are accessible from outside the account. It’s free.
Activation #
Console → IAM → Access Analyzer → Create analyzer
- Type: Account / Organization
- Name: my-account-analyzer
Within 24 hours of activation, the list of externally accessible resources appears.
What it catches #
| Resource | Risk |
|---|---|
| S3 bucket — public read | Anyone reads objects |
| S3 bucket — permission to another account | Need to confirm it’s intended |
| KMS key — external use | Encryption key exposure |
| IAM Role — external trust | Another account can borrow it |
| Lambda — external invoke permission | Anyone can invoke |
| RDS snapshot / SQS / SNS / Secrets Manager / EBS / ECR — public | Data / message exposure |
Policy validation #
When you write a new policy, Access Analyzer also validates it and shows recommendations: unused permissions, overly broad wildcards, statements that have no condition, and so on.
Action Last Accessed — finding unused permissions #
For each IAM user / role, it shows when each service (and, for some services, each action) was last used. Permissions unused for 90 days are candidates to narrow.
aws iam generate-service-last-accessed-details --arn arn:aws:iam::123456789012:role/MyRole
# (after a moment)
aws iam get-service-last-accessed-details --job-id ...
Least privilege — patterns that work #
“Only as much as needed, only where needed.” The ideal is easy; the problem is how to practice it in operations.
Pattern 1: start broad → narrow #
Crafting perfect permissions from the start is hard. The following flow is realistic.
- Start with PowerUserAccess / a service’s *FullAccess.
- After a week of operation, check Access Analyzer’s Action Last Accessed.
- Remove unused services / actions.
- Narrow wildcards to ARNs.
- Add conditions.
Repeat this cycle every quarter.
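The "remove unused services" step can be sketched in Python against the JSON that get-service-last-accessed-details returns. This is a minimal sketch, assuming the output has been parsed into a dict. The function name and the 90-day default are hypothetical, and the result is a list of candidates to review, not something to remove blindly.

```python
from datetime import datetime, timedelta, timezone


def unused_services(details, now=None, max_idle_days=90):
    """List service namespaces that were not used within max_idle_days.

    details is the parsed JSON from get-service-last-accessed-details.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    unused = []
    for entry in details.get("ServicesLastAccessed", []):
        last = entry.get("LastAuthenticated")  # absent when never used
        if last is None:
            unused.append(entry["ServiceNamespace"])
            continue
        last_used = datetime.fromisoformat(last.replace("Z", "+00:00"))
        if last_used.tzinfo is None:
            last_used = last_used.replace(tzinfo=timezone.utc)
        if last_used < cutoff:
            unused.append(entry["ServiceNamespace"])
    return sorted(unused)
```

A namespace in the result maps to actions such as s3:* or rds:* in the policy, which are the ones to narrow or drop in the next pass.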
Pattern 2: separate users ↔ roles #
Reconfirm the pattern from Chapter 2 IAM.
- People via SSO (Chapter 5 CloudShell and SSO).
- Machines via Roles (instance profile, execution role, OIDC).
- CI/CD via OIDC + Role (GitHub Actions, GitLab).
The result is a structure in which permanent access keys almost disappear.
Pattern 3: Permission Boundary #
“Whatever this user does, only within this limit.” A permission boundary lets you give a junior developer IAM permissions while blocking them from creating a new policy or user to raise their own permissions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "s3:*",
        "rds:*",
        "logs:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": { "aws:RequestTag/env": "dev" }
      }
    }
  ]
}
A user with this boundary attached can’t create anything but dev-environment resources, even with their own policy.
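The rule behind a boundary is an intersection: an action is usable only when both the identity policy and the boundary allow it. Here is a deliberately simplified Python sketch of that rule. The helper names are hypothetical, and it looks only at action patterns, ignoring resources, conditions, and explicit denies.

```python
from fnmatch import fnmatchcase


def allows(action_patterns, action):
    """True if any pattern such as 'ec2:*' matches the action.

    IAM action names are case-insensitive, so both sides are lowercased.
    """
    return any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in action_patterns
    )


def effectively_allowed(identity_actions, boundary_actions, action):
    """Usable only if BOTH the identity policy and the boundary allow it."""
    return allows(identity_actions, action) and allows(boundary_actions, action)
```

Even an identity policy of ["*"] gains nothing outside the boundary: iam:CreateUser stays unavailable because the boundary never lists it.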
Pattern 4: environment separation #
- Account separation (Organizations) — the strongest line of defense, covered in Chapter 29 security governance.
- VPC separation — separate prod VPC and dev VPC even in the same account (Chapter 8 EC2 and VPC).
- Tag separation — separate permissions and cost with the env=prod tag (Chapter 3 cost management’s tag strategy).
Pattern 5: break-glass role #
Normally ReadOnly only, briefly elevated to Admin on an incident. Split the SSO Permission Set into two.
| Item | Normal | On an incident |
|---|---|---|
| Use | ReadOnly | Break-glass-Admin (1 hour) |
| Alert | — | Auto-alert to a Slack channel |
| Audit | — | All actions recorded in CloudTrail |
CloudTrail — who did what #
CloudTrail records every API call inside the account. It’s auto-enabled right after sign-up and you can view the last 90 days of events for free. For operations, create a Trail and store it permanently in S3.
A check right after sign-up #
Console → CloudTrail → Trails
→ create one Multi-region Trail (the operational standard)
- Name: my-trail
- Storage: a new S3 bucket
- Log file SSE-KMS encryption: on
- Log file validation: on (tamper detection)
CloudTrail’s two kinds of events #
| Kind | What | Cost |
|---|---|---|
| Management events | API calls (RunInstances, DeleteBucket, etc.) | Free (first copy only) |
| Data events | S3 object access, Lambda invocations, etc. | Paid (high volume) |
Most Trails turn on management only. Turn on data events only for specific buckets / functions.
Commonly used queries #
Search in the console’s Event history.
- AccessKeyId — traces of exposed-key use
- UserName — what one user did
- EventName — ConsoleLogin, DeleteBucket, RunInstances
- Time range
SQL analysis with CloudTrail Lake or Athena is also possible (large organizations).
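The same AccessKeyId search can be run offline against the log files a Trail delivers to S3. This is a minimal Python sketch, assuming one log file has already been downloaded and decompressed into a JSON string with a top-level Records array. The function name is hypothetical.

```python
import json


def events_for_access_key(log_text, access_key_id):
    """Return (eventTime, eventName, awsRegion, sourceIPAddress) for one key."""
    records = json.loads(log_text).get("Records", [])
    hits = []
    for record in records:
        identity = record.get("userIdentity", {})
        if identity.get("accessKeyId") == access_key_id:
            hits.append(
                (
                    record.get("eventTime"),
                    record.get("eventName"),
                    record.get("awsRegion"),
                    record.get("sourceIPAddress"),
                )
            )
    # Oldest first; records without a timestamp sort to the front.
    return sorted(hits, key=lambda hit: hit[0] or "")
```

Unfamiliar regions or source IPs in the result are the first thing to look at after a key exposure.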
GuardDuty — automated threat detection #
GuardDuty analyzes CloudTrail / VPC Flow Logs / DNS logs with machine learning to catch suspicious activity.
Examples it catches #
| Pattern | Description |
|---|---|
| EC2 communicating from an abnormal region | Intrusion / C&C |
| An access key used from an abnormal region | Use after key exposure |
| Cryptocurrency mining traffic | Mining after intrusion |
| EC2 doing abnormal port scans | Lateral movement |
| Communication with a Tor exit node | Suspicious traffic |
| Abnormal behavior by an IAM user | Credential theft |
Activation #
Console → GuardDuty → Get started → Enable
- 30-day free trial
- After the free trial, charged by data volume (usually $10 to $50 per month for a small operation)
At the operational stage, always turn it on. The incidents it catches are large relative to the price.
Security Hub — consolidating security posture #
It gathers the results of several security tools in one place. It automatically runs standard checks like the CIS Benchmark and AWS Foundational Security Best Practices.
Console → Security Hub → Enable
- turn on all recommended standards
- GuardDuty / Access Analyzer / Inspector results are consolidated
Right after sign-up it’s slightly overkill. Turn it on after a quarter, once resources reach a certain level.
Common incident cases #
Case 1: pushing an access key to GitHub → cryptocurrency mining #
The most common scenario. A bot finds it within minutes, and cost accumulates within hours.
Response — immediately deactivate the key → issue a new key → check CloudTrail for traces of use → if possible, isolate the account itself. In a billing dispute, contacting AWS support often gets the charges from a first incident waived retroactively (but not repeats).
Prevention — pre-commit hooks (gitleaks, truffleHog), GitHub secret scanning, not keeping the key locally at all (SSO).
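To show what a scanner such as gitleaks looks for, here is a minimal Python sketch that finds strings shaped like access key IDs. It assumes the common 20-character format with an AKIA (long-term) or ASIA (temporary) prefix. The function name is hypothetical, and real scanners cover far more patterns, including the secret key itself.

```python
import re

# Access key IDs: a 4-letter prefix followed by 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, key_id) pairs for every match in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits
```

Run against staged file contents in a pre-commit hook, a non-empty result would be the signal to abort the commit.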
Case 2: a too-broad S3 bucket policy #
Trying to make a bucket visible “to everyone at our company,” someone adds Principal: "*", which makes it publicly readable, and it gets indexed by search engines.
Response — Access Analyzer catches it. S3 Block Public Access blocks public policies at the account / bucket level.
Prevention — always turn on S3 Block Public Access; when public is truly needed, the CloudFront + Origin Access Control pattern.
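The risky shape is easy to spot mechanically. Below is a minimal Python sketch that flags Allow statements in a bucket policy whose principal is a wildcard and that have no Condition. The function name is hypothetical, and a statement with a Condition is skipped without judging whether that condition is actually restrictive.

```python
def public_statements(policy):
    """Return Sids (or positions) of unconditioned Allow statements for '*'."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    flagged = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") in ("*", ["*"])
        )
        if is_wildcard and "Condition" not in statement:
            flagged.append(statement.get("Sid", f"statement[{index}]"))
    return flagged
```

Access Analyzer performs a far more thorough analysis; the sketch only illustrates the pattern from this case.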
Case 3: phishing a root password without MFA #
A fake “AWS billing alert” email → a fake login page → entering the password. Without root MFA, you’re breached right there.
Response — change the password immediately, check all resources, contact AWS support.
Prevention — root MFA mandatory (ideally hardware), day-to-day work via IAM / SSO.
Case 4: a departed employee’s access key still alive #
After offboarding, the IAM user / key remains as-is and an incident happens months later.
Response — immediately disable / delete the user + check the Trail.
Prevention — include IAM cleanup in the offboarding checklist. With SSO, disabling in the IdP is the end of it.
Case 5: CloudTrail is off #
You go to start a breach investigation and there are no logs from the time of the incident. Someone intentionally turned off the Trail, or it was never on to begin with.
Response — it’s already too late. Partially restore with other available logs (CloudWatch, GuardDuty findings).
Prevention — Trail enabled + Log file validation + S3 Object Lock. Deny disabling CloudTrail itself with an SCP (Service Control Policy).
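An SCP that denies turning CloudTrail off could look like the following sketch, built as a Python dict so it can be printed as JSON. The Sid and the function name are hypothetical, and the action list is one reasonable selection, not an exhaustive one.

```python
import json


def build_protect_cloudtrail_scp():
    """Return an SCP document that denies common trail-tampering actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyCloudTrailTampering",
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:DeleteTrail",
                    "cloudtrail:UpdateTrail",
                    "cloudtrail:PutEventSelectors",
                ],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_protect_cloudtrail_scp(), indent=2))
```

SCPs do not apply to the Organizations management account, so the Trail there still needs its own protection.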
Common pitfalls #
- Turning on MFA and stopping there — even with MFA on, an exposed access key still works: a CLI call with a long-term key never prompts for MFA unless a policy requires it (Chapter 4’s credential chain). Mind the keys too, or reduce keys themselves with SSO.
- Promising rotation but not doing it — automate periodic checks with the Credential Report. Send keys over 90 days old as a Slack alert.
- Broad allowance of iam:PassRole — iam:PassRole = * is effectively privilege escalation. Narrow it to a role ARN.
- Conditions too strong → you lock yourself out — you allowed only the company IP with aws:SourceIp but you’re out of the office. Put a console IP guard last and keep a bypass (VPN / an emergency user).
- Turning off GuardDuty — people turn it off “because the cost is a waste,” but one incident costs more than a year of GuardDuty. In operations, always keep it on.
- Not looking at Security Hub findings — turning it on but not looking at alerts is meaningless. Do a weekly review, or send Slack alerts via EventBridge.
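The iam:PassRole pitfall above can be checked mechanically. This minimal Python sketch flags Allow statements in an identity policy that grant iam:PassRole (directly or through a wildcard) on Resource "*". The function names are hypothetical, and NotAction, NotResource, and conditions are ignored.

```python
def as_list(value):
    """IAM allows a single string where a list is expected; normalize it."""
    return value if isinstance(value, list) else [value]


def broad_passrole_statements(policy):
    """Return Sids (or positions) of statements allowing PassRole on '*'."""
    flagged = []
    for index, statement in enumerate(as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in as_list(statement.get("Action", []))]
        grants_passrole = any(a in ("*", "iam:*", "iam:passrole") for a in actions)
        if grants_passrole and "*" in as_list(statement.get("Resource", [])):
            flagged.append(statement.get("Sid", f"statement[{index}]"))
    return flagged
```

Each flagged statement is a candidate for replacing "*" with the specific role ARNs the workload actually passes.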
Exercises #
- Based on the table in §“MFA — the single most important thing,” write in one paragraph which MFA kind you use for root and for a regular user respectively, and why SMS is forbidden.
- Without looking, write the 5-step procedure in §“Access-key rotation.” Explain, connecting to Chapter 4’s credential chain, why at step 4 you only deactivate rather than immediately delete the old key.
- Pick one of the five patterns in §“Least privilege — patterns that work” and pair it with which case in §“Common incident cases” it prevents.
In short: Almost all AWS security incidents come from absent MFA, exposed keys, excessive permissions, or absent logging. Enforce MFA on all users, rotate access keys every 90 days, narrow broad permissions with IAM Access Analyzer, and reduce permanent keys with people on SSO and machines on Roles. Keep CloudTrail and GuardDuty on, and you discover incidents in time.
Next chapter #
This is where the right-after-sign-up setup ends. In the next Chapter 7 CloudWatch intro, we sort out CloudWatch, the eye of all operations. We cover the makeup of Logs / Metrics / Alarms / Dashboards, log groups and retention, Metric Filters, and the basics of Logs Insights queries.