Chapter 31

Lambda in Depth — Cold Starts · SnapStart · Packaging · Observability

Adds a production-operations lens on top of Chapter 17's Lambda basics. Covers cold starts and SnapStart · Provisioned Concurrency, packaging with Layers and container images (one full FastAPI cycle), Lambda Powertools-based observability, combining with Step Functions, and the Lambda vs Fargate cost trade-off.

Chapter 17 Lambda basics covered creating a function and invoking it from a trigger, and Chapter 18 API Gateway + Lambda attached an HTTP front end. This chapter builds on both to cover the deeper topics you run into when operating Lambda in production: cold starts, packaging strategy, observability, and when Lambda beats ECS Fargate.

This is Part 5’s last service chapter. The cold starts and cost trade-offs covered here draw a clear boundary between this book’s container-first route (Chapter 15 ECS and Fargate) and the serverless route. They become the basis for judging which parts of the Part 6 capstone fullstack deployment to put on Fargate and which on Lambda.

Cold starts: Lambda’s first-invocation cost

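After a period with no invocations, Lambda shuts down the idle execution environment. When an invocation arrives and no warm environment is available (after idle time, or when traffic scales out beyond the existing environments), Lambda creates a fresh execution environment, and this process is the cold start.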

The cold start has these phases.

Creating the execution environment (micro VM)
Runtime initialization
Loading function code + running init code (outside the handler)

After that, invocations routed to the same environment skip these phases and run as fast warm starts. Here’s how to reduce cold starts.

Reduce init code — importing heavy libraries or opening large connections outside the handler incurs that cost on every cold start. Keep only what’s essential. (However, keeping a DB connection opened once outside the handler so it’s reused on warm starts is recommended.)
Reduce package size — the smaller the deployment package, the faster it loads.
Choose the language — Python / Node have lightweight initialization, while JVM-family runtimes have long cold starts (consider SnapStart below in that case).
Increase memory — Lambda gets CPU in proportion to memory. Raising memory can also speed up initialization, so sometimes raising memory is both cheaper and faster. Don’t guess the right value — measure and pick it with Lambda Power Tuning.
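The first and third points can be sketched in a few lines. This is a minimal illustration, not a prescribed layout: create_db_connection is a hypothetical stand-in for a real driver call, statistics stands in for a heavy library, and the connection is opened lazily on first use, which is one variant of "open once outside the handler".

```python
import time

_connection = None   # one per execution environment, reused on warm starts
_cold_start = True   # True only for the first invocation in this environment


def create_db_connection():
    """Hypothetical stand-in for a real driver's connect call."""
    return {"opened_at": time.time()}


def get_connection():
    """Open the connection on first use and keep it for later warm invocations."""
    global _connection
    if _connection is None:
        _connection = create_db_connection()
    return _connection


def handler(event, context):
    global _cold_start
    was_cold, _cold_start = _cold_start, False

    result = {"cold_start": was_cold, "connection_id": id(get_connection())}

    if "samples" in event:
        # Rarely used dependency: imported only on the path that needs it, so
        # ordinary cold starts do not pay for loading it.
        import statistics  # stands in for a heavy library
        result["mean"] = statistics.mean(event["samples"])

    return result
```

Calling handler twice in the same process mimics a cold invocation followed by a warm one: the second call reports cold_start as False and reuses the same connection object.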

SnapStart: restoring an initialization snapshot

SnapStart performs the function’s initialization once ahead of time (when you publish a function version), takes a memory/disk snapshot, and on invocation restores that snapshot to skip the initialization phase. It cuts cold starts significantly. On Java it carries no extra charge; on Python and .NET, AWS bills for caching the snapshot and for each restore.

Supported runtimes: Java 11 and above, Python 3.12 and above, .NET 8 and above. (Node.js · Ruby and others are unsupported.)
It does not apply to container images. It works only with the zip-packaging + supported-runtime combination. In other words, if you choose the “container image” packaging in §“Packaging” below, you can’t use SnapStart, so if cold starts matter, the packaging choice is affected from the start.
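Since initialization runs only once at snapshot time, anything generated during initialization is shared by every environment restored from that snapshot. So you must not freeze random seeds · timestamps · one-time tokens in the initialization phase. Re-create such values at restore time with runtime hooks (in Python, register_before_snapshot / register_after_restore from snapshot_restore_py).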

Python SnapStart runtime hook — re-create the connection on restore

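from snapshot_restore_py import register_after_restore

conn = None  # module-level handle that the handler reuses

@register_after_restore
def reinit_db():
    """Reconnect right after restore; a connection captured in the snapshot is stale."""
    global conn
    conn = create_db_connection()   # placeholder for your own connection factory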
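The same hook also covers the uniqueness problem described above. The sketch below is an illustration under assumptions: instance_id is a hypothetical per-environment value, and the ImportError fallback exists only so the code also runs outside the Lambda runtime, where snapshot_restore_py may not be installed.

```python
import uuid

try:
    # Provided by the Lambda Python runtime when SnapStart is available.
    from snapshot_restore_py import register_after_restore
except ImportError:
    def register_after_restore(func):
        """Local fallback (assumption) so the sketch runs outside Lambda."""
        return func

# Generated during initialization, so with SnapStart this value is captured in
# the snapshot and every restored environment would start with the same ID.
instance_id = uuid.uuid4().hex


@register_after_restore
def refresh_instance_id():
    """Replace the snapshotted value with a fresh one in each restored environment."""
    global instance_id
    instance_id = uuid.uuid4().hex


def handler(event, context):
    return {"instance_id": instance_id}
```

Locally you can call refresh_instance_id() by hand to imitate a restore; in Lambda the runtime invokes it after each restore.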

SnapStart and Provisioned Concurrency both reduce cold starts, but they bill differently. SnapStart adds no charge on Java, and on Python and .NET it bills for snapshot caching and for each restore. PC bills continuously for the amount you keep provisioned, whether or not it is invoked. So on a supported runtime, review SnapStart first, and layer PC onto the ultra-low-latency paths where SnapStart alone isn’t enough.

Provisioned Concurrency

For latency-sensitive paths where you must eliminate cold starts entirely (payments, user-facing APIs), use Provisioned Concurrency (PC). It keeps the specified number of execution environments pre-initialized and warm.

Alias + Provisioned Concurrency (fixed amount; auto-scaling is layered on separately with Application Auto Scaling)

resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.api.function_name
  # Requires publish = true on the function: PC cannot target $LATEST.
  function_version = aws_lambda_function.api.version
}

resource "aws_lambda_provisioned_concurrency_config" "live" {
  function_name                     = aws_lambda_function.api.function_name
  qualifier                         = aws_lambda_alias.live.name
  provisioned_concurrent_executions = 5
}

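Upside: requests served by the provisioned environments see no cold start. Traffic beyond the provisioned amount falls back to on-demand environments and can still cold-start.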
Downside: you’re billed even with no invocations for the amount you’ve pre-warmed. Once steady traffic is above a certain level, this is exactly the range where ECS Fargate is cheaper than Lambda. Scheduling the PC amount by time of day with Application Auto Scaling can cut nighttime cost.

Decision rule: if traffic is intermittent, Lambda (pay-per-request) wins; if it flows steadily above a certain amount, Fargate (always-on billing but a lower unit price) wins. If you need a lot of PC, that’s a signal to consider moving to Fargate.
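A rough calculation makes the rule concrete. Everything numeric here is an assumption for illustration: the prices approximate x86 list prices in one Region and will differ in yours, the workload is 100 ms per request at 512 MB, and two 0.5 vCPU / 1 GB Fargate tasks are assumed to be enough for the load. The sketch ignores the free tier, API Gateway or load balancer charges, data transfer, and Provisioned Concurrency.

```python
# Assumed list prices in USD; check the current pricing pages for your Region.
LAMBDA_PRICE_PER_REQUEST = 0.20 / 1_000_000
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667
FARGATE_PRICE_PER_VCPU_HOUR = 0.04048
FARGATE_PRICE_PER_GB_HOUR = 0.004445
HOURS_PER_MONTH = 730


def lambda_monthly_cost(requests, avg_duration_s=0.1, memory_gb=0.5):
    """Pay-per-request: a request fee plus GB-seconds of compute."""
    compute = requests * avg_duration_s * memory_gb * LAMBDA_PRICE_PER_GB_SECOND
    return requests * LAMBDA_PRICE_PER_REQUEST + compute


def fargate_monthly_cost(tasks=2, vcpu=0.5, memory_gb=1.0):
    """Always-on: billed per task-hour regardless of traffic."""
    hourly = vcpu * FARGATE_PRICE_PER_VCPU_HOUR + memory_gb * FARGATE_PRICE_PER_GB_HOUR
    return tasks * hourly * HOURS_PER_MONTH


def break_even_requests(**fargate_kwargs):
    """Monthly request count at which both options cost the same."""
    return fargate_monthly_cost(**fargate_kwargs) / lambda_monthly_cost(1)
```

Under these assumptions the break-even lands in the region of 35 million requests a month (about 13 requests per second sustained). Below that Lambda is cheaper, above it Fargate is. Adding Provisioned Concurrency raises Lambda’s fixed cost and moves the break-even lower.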

Packaging: Layers and container images

There are three ways to bundle Lambda code.

Method	Size limit	SnapStart	Fit
Inline / zip upload	50 MB zipped (direct upload), 250 MB unzipped	Available	small functions
Layers	a zip of shared dependencies, up to 5 per function, counted toward the same 250 MB unzipped limit	Available	many functions sharing the same library
Container image	up to 10 GB	Unavailable	large dependencies, reusing an existing Docker workflow

Layers

If many functions use the same library (e.g., a common utility, an SDK extension, Powertools), split it into a Layer. The function package gets lighter, and the library is managed once. Note that the function package and all of its Layers together count toward the 250 MB unzipped limit.

Container image: one FastAPI cycle

If your dependencies are large (ML libraries, etc.) or your team already builds with Docker, a container image is clean. You can reuse the container workflow from Chapter 15 ECS · Chapter 16 ECR directly on Lambda. But as seen above, container images can’t use SnapStart, so for cold-start-sensitive functions consider zip + a supported runtime. One cycle of putting a FastAPI app on a Lambda container looks like this.

Dockerfile — FastAPI on Lambda

FROM public.ecr.aws/lambda/python:3.13

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app/ ${LAMBDA_TASK_ROOT}/app/
# Adapt ASGI (FastAPI) to a Lambda handler with Mangum
CMD ["app.main.handler"]

app/main.py — Mangum adapter

from fastapi import FastAPI
from mangum import Mangum

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

handler = Mangum(app)  # converts API Gateway events to ASGI
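To make "converts API Gateway events to ASGI" concrete, here is a deliberately simplified, standard-library-only sketch of what such an adapter does. It is not Mangum’s implementation: it assumes the HTTP API payload format 2.0, handles only plain-text bodies, and uses tiny_asgi_app as a hypothetical stand-in for the FastAPI app.

```python
import asyncio


async def tiny_asgi_app(scope, receive, send):
    """Stand-in for the FastAPI app: answers GET /health, 404 otherwise."""
    ok = scope["method"] == "GET" and scope["path"] == "/health"
    status, body = (200, b'{"status": "ok"}') if ok else (404, b'{"detail": "Not Found"}')
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


def make_handler(asgi_app):
    """Wrap an ASGI app in a Lambda-style handler(event, context)."""

    def handler(event, context):
        # 1. API Gateway HTTP API (payload v2) event -> ASGI scope
        scope = {
            "type": "http",
            "asgi": {"version": "3.0"},
            "http_version": "1.1",
            "method": event["requestContext"]["http"]["method"],
            "path": event["rawPath"],
            "query_string": event.get("rawQueryString", "").encode(),
            "headers": [
                (k.lower().encode(), v.encode())
                for k, v in event.get("headers", {}).items()
            ],
        }
        request_body = (event.get("body") or "").encode()
        response = {"statusCode": 500, "headers": {}, "body": "", "isBase64Encoded": False}

        async def receive():
            return {"type": "http.request", "body": request_body, "more_body": False}

        # 2. ASGI response messages -> Lambda proxy response
        async def send(message):
            if message["type"] == "http.response.start":
                response["statusCode"] = message["status"]
                response["headers"] = {k.decode(): v.decode() for k, v in message["headers"]}
            elif message["type"] == "http.response.body":
                response["body"] += message.get("body", b"").decode()

        asyncio.run(asgi_app(scope, receive, send))
        return response

    return handler


handler = make_handler(tiny_asgi_app)

sample_event = {
    "version": "2.0",
    "rawPath": "/health",
    "rawQueryString": "",
    "headers": {"accept": "application/json"},
    "requestContext": {"http": {"method": "GET"}},
}
```

Calling handler(sample_event, None) locally returns a proxy response with status 200 and the JSON body from the health route. A real adapter also handles base64 bodies, cookies, multi-value headers, and the other event formats, which is why the chapter uses Mangum rather than hand-rolling this.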

Build·push to ECR, then create the Lambda

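# Authenticate Docker to ECR first (the repository must already exist)
aws ecr get-login-password --region ap-northeast-2 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com

# Build for x86_64, the default Lambda architecture
docker build --platform linux/amd64 -t myapp-lambda .
docker tag myapp-lambda:latest \
  123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp-lambda:latest
docker push 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp-lambda:latest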

aws lambda create-function \
  --function-name myapp \
  --package-type Image \
  --code ImageUri=123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp-lambda:latest \
  --role arn:aws:iam::123456789012:role/myapp-lambda-role

Put the HTTP API from Chapter 18 API Gateway + Lambda in front and the FastAPI app runs serverless. Since we put the same FastAPI app on ECS Fargate in Chapter 22 infrastructure skeleton, comparing the two makes the “serverless vs always-on container” trade-off clear.

Observability: Lambda Powertools

Lambda is short-lived and distributed, which makes debugging hard. Lambda Powertools standardizes three things.

Structured logging — automatically attaches the request ID · whether it was a cold start, etc., to JSON logs so they query well in Chapter 7 CloudWatch Logs Insights.
Metrics — leaves business metrics in EMF (Embedded Metric Format) so they auto-aggregate into CloudWatch metrics.
Tracing — integrates with X-Ray to trace the single-request flow API Gateway → Lambda → DynamoDB (Chapter 26 monitoring — CloudWatch alarms and X-Ray).

Powertools — logging · metrics · tracing in one go

from aws_lambda_powertools import Logger, Tracer, Metrics
from aws_lambda_powertools.metrics import MetricUnit

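# Service name and metrics namespace are read from the POWERTOOLS_SERVICE_NAME and
# POWERTOOLS_METRICS_NAMESPACE environment variables (or pass service= / namespace=).
logger = Logger()
tracer = Tracer()
metrics = Metrics()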

@logger.inject_lambda_context     # auto-attach request ID & cold-start flag
@tracer.capture_lambda_handler    # X-Ray segment
@metrics.log_metrics              # flush EMF metrics
def handler(event, context):
    metrics.add_metric(name="OrderCreated", unit=MetricUnit.Count, value=1)
    logger.info("order created", extra={"order_id": event["id"]})
    return {"statusCode": 201}
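To see what "leaving metrics in EMF" means on the wire, the following standard-library sketch builds one Embedded Metric Format record by hand. It illustrates the log format only and is not how Powertools is implemented; the namespace MyApp and service name orders are hypothetical.

```python
import json
import time


def build_emf_record(namespace, service, name, value, unit="Count"):
    """Return one log record in CloudWatch Embedded Metric Format."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["service"]],
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        "service": service,  # dimension value, referenced by key name above
        name: value,         # metric value, referenced by key name above
    }


def emit_metric(namespace, service, name, value, unit="Count"):
    """Print the record as one JSON line and return that line."""
    # Lambda forwards stdout to CloudWatch Logs, which extracts metrics from
    # EMF records without a separate PutMetricData call.
    line = json.dumps(build_emf_record(namespace, service, name, value, unit))
    print(line)
    return line
```

For example, emit_metric("MyApp", "orders", "OrderCreated", 1) prints a single JSON line carrying both the metric definition and its value, which is the shape metrics.add_metric produces for you when the handler returns.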

Combining with Step Functions

As we saw in Chapter 21 Step Functions, instead of cramming many steps into one Lambda with try/except, split them into a state machine. Here are the combination patterns from the Lambda-in-depth angle.

Keep each step as a small Lambda and let Step Functions handle ordering · retries · branching.
Turn on SnapStart or Provisioned Concurrency only for the steps where cold starts are a problem.
Handle long waits (external approval, etc.) with Step Functions’ wait states (Wait / callback tokens) instead of keeping a Lambda running, to save cost. Lambda’s maximum execution time is 15 minutes, so any flow longer than that must be split with Step Functions.
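The three patterns above can be sketched as one Amazon States Language definition, written here as a Python dict so it can be checked and serialized. The state names and the function names validate-order, request-approval, and fulfil-order are hypothetical, and the retry and timeout values are illustrative rather than recommendations.

```python
import json

# Hypothetical function names; replace with your own Lambda names or ARNs.
definition = {
    "Comment": "Order flow split into small Lambda steps (illustrative)",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "validate-order", "Payload.$": "$"},
            "OutputPath": "$.Payload",
            # Retries live in the state machine, not in try/except in the function.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Next": "WaitForApproval",
        },
        "WaitForApproval": {
            "Type": "Task",
            # Callback pattern: the execution pauses, with no Lambda running,
            # until something calls SendTaskSuccess or SendTaskFailure with the token.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": "request-approval",
                "Payload": {"taskToken.$": "$$.Task.Token", "order.$": "$"},
            },
            "TimeoutSeconds": 86400,  # give up after one day
            "Next": "FulfilOrder",
        },
        "FulfilOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "fulfil-order", "Payload.$": "$"},
            "End": True,
        },
    },
}


def undefined_targets(defn):
    """Return StartAt/Next targets that do not name a state (should be empty)."""
    states = defn["States"]
    targets = [defn["StartAt"]] + [s["Next"] for s in states.values() if "Next" in s]
    return [t for t in targets if t not in states]


asl_json = json.dumps(definition, indent=2)
```

asl_json is the string you would hand to the state machine definition, for example through Terraform. The approval step waits up to a day while each Lambda invocation stays well under the 15-minute limit.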

Lambda vs Fargate: a one-line decision

Signal	Choice
Intermittent / event-driven traffic	Lambda (pay-per-request)
Steady traffic above a certain amount	Fargate (lower unit price)
Work over 15 minutes	Fargate (Lambda maxes at 15 minutes)
Cold-start sensitive + steady traffic	Fargate (or weigh the SnapStart / PC cost)
Fast spike response	Lambda (scales out within seconds, up to the account concurrency limit)
Large dependencies (image up to 10 GB)	Lambda container or Fargate (but Lambda containers can’t use SnapStart)
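As a starting point for the first exercise, the table can be encoded as a small function. This is a sketch of the table’s rules only: the traffic labels are hypothetical, the rule order is one reasonable reading of the table, and a real decision should also weigh measured cost.

```python
def choose_compute(traffic, max_runtime_minutes, cold_start_sensitive=False):
    """Return (choice, reason) following the signals in the table.

    traffic: 'intermittent', 'spiky', or 'steady' (hypothetical labels).
    """
    if max_runtime_minutes > 15:
        return "Fargate", "exceeds Lambda's 15-minute maximum"
    if traffic == "steady":
        return "Fargate", "lower unit price for always-on load"
    if cold_start_sensitive:
        return "Lambda", "add SnapStart first, then Provisioned Concurrency if needed"
    return "Lambda", "pay-per-request fits intermittent or spiky traffic"
```

For instance, choose_compute("steady", 1, cold_start_sensitive=True) picks Fargate, matching the "cold-start sensitive + steady traffic" row.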

Exercises

Pick one of your workloads and judge whether Lambda or ECS Fargate fits using the signals in the §“Lambda vs Fargate” table, and write the rationale in one paragraph. Also write how that conclusion changes if you’re in a situation where you’d have to turn on a lot of Provisioned Concurrency.
You have a cold-start-sensitive Python function. Answer whether you can use SnapStart and container-image packaging at the same time, citing §“SnapStart” and the table in §“Packaging”, and write which of the two you’d have to give up and why.
A function with SnapStart on suffers from “the DB connection opened during initialization is severed after restore.” Write how you solve it with the runtime hook from §“SnapStart” as a code flow.

In short: a Lambda cold start is the time spent creating the execution environment, starting the runtime, and running init code, and you reduce it by shrinking initialization, shrinking the package, and adding memory. On a supported runtime (Java 11+ / Python 3.12+ / .NET 8+, zip packaging only, not container images), try SnapStart first, which is free on Java and billed for snapshot caching and restores on Python and .NET, and layer the continuously billed Provisioned Concurrency onto ultra-low-latency paths where that still isn’t enough. If you need a lot of PC, that’s a signal that Fargate is cheaper. Packaging comes as zip / Layers / container image (up to 10 GB, no SnapStart, a full cycle with FastAPI + Mangum), and observability standardizes structured logging, EMF metrics, and X-Ray tracing with Lambda Powertools. Split flows longer than 15 minutes with Step Functions, and the general rule is Lambda for intermittent traffic, Fargate for steady traffic.

Next Chapter

That finishes Part 5. In the next chapter, Chapter 32 Deploying a fullstack app on AWS, we weave all the services from Chapters 1 to 31 into one system. We deploy modern-react’s Next.js app and modern-python’s FastAPI app on one account with ECS Fargate + RDS + S3 + CloudFront + Terraform, and apply the “where Fargate, where Lambda” judgment organized in this chapter to a real system.