Chapter 17

Lambda Basics

The first step into AWS serverless. We cover Lambda's role (vs ECS / EC2), the runtime / handler / event / context model, synchronous / asynchronous / stream invocation, concurrency and cold starts, Reserved / Provisioned Concurrency, memory and time limits, logging and Layers, and cost.

Chapter 15 ECS and Fargate and Chapter 16 ECR covered a model where containers run all the time: even at zero traffic, at least one container stays alive. When traffic varies widely, when the work is short, or when you want to cut the operational burden further, a different option fits: Lambda.

This chapter covers in one pass how Lambda works, its model (runtime / handler / event), invocation methods, cold starts, concurrency and limits, and logging. The model we set up here leads into the HTTP exposure of Chapter 18 API Gateway + Lambda, the event processing of Chapter 19 EventBridge / SQS / SNS, and the operational patterns of Part 5’s Chapter 31 Lambda in depth.

What Lambda does #

AWS Lambda is a serverless function execution platform. When an event arrives, a function instance starts up to handle it, and some time after the work finishes the instance goes away again. At zero traffic, the cost is zero too.

The picture of Lambda
  1. An event arrives (HTTP / S3 upload / SQS message / cron)
  2. Lambda reuses a warm container or starts a new one (cold)
  3. The handler function runs (a few ms ~ 15 min)
  4. The response / result goes back to the caller
  5. After some idle time, the container terminates

When Lambda fits #

Cases where Lambda fits #

  • Event-driven — S3 upload → thumbnail generation, SQS message → processing, EventBridge schedule → batch
  • Highly variable traffic — usually 0, occasionally spiking. No cost at 0
  • Short processing — usually seconds ~ minutes
  • Side / auxiliary workloads — helper functions beside the main system
  • Part of an API (only the endpoints that suit it; not every API has to be Lambda)

Cases where Lambda doesn’t fit #

Case | Reason
A large API with constant traffic | ECS is better on concurrency / cold start / cost
Processing longer than 15 minutes | One Lambda invocation is 15 minutes max
Very large memory / GPU | Lambda is 10 GB memory max, no GPU
Stateful connections (a WebSocket backend, etc.) | Possible but complex to design
An always-on DB connection pool | Each function instance opens its own connections; there is no pool shared across instances

Comparison #

Item | EC2 | ECS / Fargate | Lambda
Time running | Always | Always (Service) | Per invocation
Operational burden | Large | Medium (Fargate small) | Small
Cold start | None | Small | Yes (tens of ms ~ a few sec)
Time limit | Unlimited | Unlimited | 15 min
Cost at 0 traffic | Large | Medium | 0
Concurrent processing | OS level | Multiple in one container | 1 concurrent per function instance

The last row matters. A single Lambda function instance processes only one invocation at a time. If there are N concurrent invocations, Lambda scales out to N instances.

First Lambda — Hello, World #

Creating a function (console) #

Console → Lambda → “Create function” → Author from scratch → proceed with Python 3.13.

lambda_function.py (the console default)
def lambda_handler(event, context):
    return {
        "statusCode": 200,
        "body": "Hello, Lambda"
    }

Save and press “Test,” and it’s invoked with an empty event. A log group (/aws/lambda/<function-name>) is automatically created in CloudWatch Logs.

Creating it with the CLI #

Bundle the code into a zip.

zip + create
zip function.zip lambda_function.py

aws lambda create-function \
  --function-name hello \
  --runtime python3.13 \
  --role arn:aws:iam::123456789012:role/lambda-basic-role \
  --handler lambda_function.lambda_handler \
  --zip-file fileb://function.zip

IaC (Terraform) #

terraform
resource "aws_lambda_function" "hello" {
  function_name = "hello"
  runtime       = "python3.13"
  handler       = "lambda_function.lambda_handler"
  role          = aws_iam_role.lambda.arn
  filename      = "function.zip"
  source_code_hash = filebase64sha256("function.zip")

  memory_size = 256
  timeout     = 10
}

Terraform itself is covered in earnest in Chapter 25 Terraform intro.

The model — Runtime / Handler / Event / Context #

We put Lambda’s model together with four keywords.

Runtime #

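The language and version. The main options are as follows (supported versions change over time, so check the current list).

  • Python (3.10 ~ 3.13)
  • Node.js (18, 20, 22)
  • Java (8, 11, 17, 21)
  • .NET, Ruby
  • Go (no managed runtime; it runs on the OS-only runtime provided.al2023)
  • Custom runtime (bring your own runtime on provided.al2023, e.g. for Rust / Zig / Swift)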

Alternatively, deploy as a container image from ECR, which ties in naturally with Chapter 16 ECR. This helps when dependencies are large: a zip package is limited to 250 MB uncompressed, while a container image can be up to 10 GB.

Handler #

The function Lambda invokes, written as <module path>.<function-name>. For Python, the module path is the file path without .py, with directories separated by dots.

myapp/handler.py
def main(event, context):
    ...

→ Handler setting: myapp.handler.main
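
To make the handler string concrete, here is a minimal sketch that splits it the way the text describes: everything before the last dot is the module path, and the last part is the function name. `split_handler` is a hypothetical helper for illustration only, not part of any Lambda API.

```python
def split_handler(handler: str) -> tuple[str, str]:
    """Split '<module path>.<function>' into (module path, function name)."""
    module_path, _, function_name = handler.rpartition(".")
    if not module_path or not function_name:
        raise ValueError(f"expected '<module>.<function>', got {handler!r}")
    return module_path, function_name


print(split_handler("lambda_function.lambda_handler"))
print(split_handler("myapp.handler.main"))
```

A handler string with no dot is rejected, which mirrors the most common configuration mistake: giving only the file name or only the function name.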

Event #

The data the caller sent. It’s a JSON object, and the shape differs by invocation source.

event from API Gateway
{
  "version": "2.0",
  "routeKey": "GET /hello",
  "headers": {...},
  "queryStringParameters": {...},
  "body": "..."
}
event from S3 ObjectCreated
{
  "Records": [
    {
      "eventSource": "aws:s3",
      "s3": {
        "bucket": {"name": "my-bucket"},
        "object": {"key": "uploads/photo.jpg"}
      }
    }
  ]
}

A library such as Powertools for AWS Lambda (Python / TypeScript / Java) helps a lot with handling these events in a type-safe way, and it's well worth using in practice.
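
Because the shape differs by source, a handler that serves more than one source has to look at the event before using it. The sketch below does this by hand for the two sample events shown above; `describe_event` is a hypothetical helper, and a real function would more likely lean on typed event classes from a library.

```python
def describe_event(event: dict) -> str:
    """Return a short description based on the event's shape."""
    records = event.get("Records") or []
    if records and records[0].get("eventSource") == "aws:s3":
        s3 = records[0]["s3"]
        return f"s3:{s3['bucket']['name']}/{s3['object']['key']}"
    if "routeKey" in event:
        return f"http:{event['routeKey']}"
    return "unknown"


def handler(event, context):
    return {"statusCode": 200, "body": describe_event(event)}
```

Calling `handler` locally with one of the sample events (and `None` as the context) is a quick way to check the branching before deploying.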

Context #

Runtime information. It holds the time remaining before the timeout (get_remaining_time_in_millis()), the request id, the function name, and so on.

the role of context
def handler(event, context):
    print(context.aws_request_id)
    print(context.function_name)
    print(context.get_remaining_time_in_millis())  # remaining time in ms
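
One practical use of the remaining time is to stop cleanly before the timeout instead of being cut off mid-work. The following is a minimal sketch under assumptions: `process_until_deadline`, the 2-second safety margin, and the `FakeContext` test double are all hypothetical, and the doubling stands in for real work.

```python
def process_until_deadline(items, context, safety_margin_ms=2000):
    """Process items until the remaining time drops below the margin.

    Returns (processed, leftover) so the caller can re-queue the leftover.
    """
    processed = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return processed, list(items[index:])
        processed.append(item * 2)  # placeholder for real work
    return processed, []


class FakeContext:
    """Local stand-in for the Lambda context object (testing only)."""

    def __init__(self, readings):
        self._readings = iter(readings)

    def get_remaining_time_in_millis(self):
        return next(self._readings)


done, left = process_until_deadline([1, 2, 3], FakeContext([5000, 3000, 1000]))
print(done, left)
```

In a deployed function the real `context` argument replaces `FakeContext`; only `get_remaining_time_in_millis()` is used, so the helper stays easy to test locally.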

Invocation methods — sync vs async vs stream #

Lambda’s behavior differs by invocation source.

1) Synchronous #

The caller waits for the result. It blocks until it gets a response.

Invocation source
  • API Gateway
  • ALB
  • Cognito
  • Direct Invoke API

aws lambda invoke \
  --function-name hello \
  --payload '{"name": "world"}' \
  --cli-binary-format raw-in-base64-out \
  out.json

2) Asynchronous #

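The caller puts the event in a queue and is done; Lambda processes it in the background. On failure it retries automatically (2 times by default), and if every attempt fails the event is discarded unless you configure a DLQ (Dead Letter Queue) or an on-failure destination.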

Invocation source
  • S3 ObjectCreated
  • SNS
  • EventBridge
  • Invoke with InvocationType=Event

aws lambda invoke \
  --function-name hello \
  --invocation-type Event \
  --payload '{"key": "value"}' \
  --cli-binary-format raw-in-base64-out \
  out.json
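
The same two invocation types are available from code. As a sketch, assuming boto3 is installed and credentials are configured when `invoke` is actually called, the difference between sync and async comes down to the `InvocationType` argument. `build_invoke_args` and `invoke` are hypothetical wrapper names; the underlying call is boto3's Lambda client `invoke`.

```python
import json


def build_invoke_args(function_name, payload, asynchronous=False):
    """Build keyword arguments for boto3's Lambda client invoke() call."""
    return {
        "FunctionName": function_name,
        # "RequestResponse" waits for the result; "Event" queues and returns.
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(payload).encode("utf-8"),
    }


def invoke(function_name, payload, asynchronous=False):
    """Invoke the function; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so build_invoke_args works without boto3

    client = boto3.client("lambda")
    response = client.invoke(**build_invoke_args(function_name, payload, asynchronous))
    # Sync invocations return status 200; async ones return 202 and no result.
    return response["StatusCode"], response["Payload"].read()


print(build_invoke_args("hello", {"name": "world"})["InvocationType"])
print(build_invoke_args("hello", {"key": "value"}, asynchronous=True)["InvocationType"])
```

With an async invocation the returned payload is empty, so the caller learns only that the event was accepted, not whether the function succeeded.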

3) Stream / polling #

Lambda automatically polls a queue / stream.

Invocation source
  • SQS
  • DynamoDB Streams
  • Kinesis
  • MSK (Managed Kafka)

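Once set up, Lambda receives and processes new messages in batches as they arrive. Failure handling depends on the source: a failed SQS batch returns to the queue and is retried until the queue's own redrive policy moves it to a DLQ, while a stream (Kinesis / DynamoDB Streams) retries the batch until it succeeds or the records expire, unless you limit retries and set an on-failure destination. The connection with SQS is covered in detail in Chapter 19 EventBridge / SQS / SNS.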
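
Since messages arrive in batches, one bad message can otherwise force the whole batch to be retried. The sketch below shows an SQS batch handler that reports only the failed messages back to Lambda. It assumes the event source mapping has ReportBatchItemFailures enabled; `process_message` and its "bad" flag are hypothetical placeholders for real work.

```python
import json


def process_message(body: dict) -> None:
    """Placeholder for real work; fails on a hypothetical 'bad' flag."""
    if body.get("bad"):
        raise ValueError("cannot process message")


def handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so the rest are not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Without that setting on the mapping, the return value is ignored and a raised exception is the only way to signal failure, which retries every message in the batch.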

What concurrency means #

This is the single most important thing about Lambda. A single function instance processes only one concurrent invocation, so the number of concurrent invocations = the number of instances brought up.

the flow of concurrent invocations
10 invocations/sec, 1 sec each → ~10 concurrent instances
100 invocations/sec, 100ms each → ~10 concurrent instances
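
The two lines above follow one rule of thumb: concurrency ≈ invocations per second × average duration in seconds. A small sketch of that arithmetic, with `estimated_concurrency` as a hypothetical helper name:

```python
def estimated_concurrency(invocations_per_sec: float, avg_duration_sec: float) -> float:
    """Approximate number of instances busy at the same time."""
    return invocations_per_sec * avg_duration_sec


for rate, duration in [(10, 1.0), (100, 0.1)]:
    print(f"{rate}/sec at {duration}s each -> ~{estimated_concurrency(rate, duration):.0f} instances")
```

This is a steady-state estimate; bursts above the average rate need more instances for a short time.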

Account concurrency #

The default is 1,000 concurrent executions per region, shared by all functions in the account. Production workloads often outgrow it, so request an increase in the Service Quotas console.

Reserved Concurrency #

Reserves N concurrent executions for a specific function and, at the same time, caps it at N.

this function is capped at 100
aws lambda put-function-concurrency \
  --function-name hello \
  --reserved-concurrent-executions 100

Its uses are as follows.

  • Block runaway of a dangerous function (e.g., a function calling a paid external API)
  • Leave concurrency headroom for other important functions (so this function can’t take all 1,000)
  • Prevent DB connection runaway (protect the RDS connection pool)

Provisioned Concurrency — avoiding cold starts #

Pre-warms N instances. If concurrent invocations are N or fewer, cold starts are 0.

aws lambda put-provisioned-concurrency-config \
  --function-name hello \
  --qualifier prod \
  --provisioned-concurrent-executions 10

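You're billed for as long as Provisioned Concurrency is configured, whether or not invocations arrive; in exchange, execution time on those instances is billed at a lower rate. Consider it when the function is the entrance to an API and cold starts directly affect user experience.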

Cold start — the pitfall you’ll hit most often #

The startup cost of a newly created function instance. It splits into two phases.

the breakdown of a cold start
[INIT phase] — once only
  ├─ prepare the container environment
  ├─ start the runtime
  ├─ import the handler module
  └─ run code outside the handler function (global)
[INVOKE phase] — every invocation
  └─ run the handler function

INIT is a cost only on the instance’s first invocation. If the same instance takes the next invocation, it skips INIT (warm).
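
The INIT / INVOKE split is easy to observe from inside a function, because module-level code runs only during INIT. A minimal sketch, assuming nothing beyond the standard library (the `_is_cold` flag and the returned field names are our own choices):

```python
import time

# Module scope runs once per instance, during the INIT phase.
_INIT_FINISHED_AT = time.time()
_is_cold = True


def handler(event, context):
    global _is_cold
    # True only for the first invocation this instance handles.
    was_cold, _is_cold = _is_cold, False
    return {
        "cold_start": was_cold,
        "instance_age_sec": round(time.time() - _INIT_FINISHED_AT, 3),
    }
```

Logging `cold_start` on every invocation gives a rough measure of how often users actually hit a cold start.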

Cold start times #

Rough ranges.

Language / form | INIT time
Python (small dependencies) | ~150 ms
Python (large dependencies, e.g., boto3 + pandas) | 1 ~ 2 sec
Node.js (small) | ~100 ms
Java | ~500 ms ~ a few sec
Container image (large) | a few sec

Reducing cold starts #

  1. Increase memory — Lambda scales vCPU in proportion to memory. Even just 256MB → 1024MB speeds up INIT.
  2. Slim dependencies — drop unused packages, tree-shaking
  3. Use global variables — a warm instance reuses an object created once outside the handler (boto3 client, DB connection, etc.)
  4. Provisioned Concurrency — as seen above
  5. Lambda SnapStart — snapshots the INIT result for fast restore. Currently supported on Java / Python / .NET

The global variable pattern #

good pattern — create the client in the global scope
import boto3

# once per Lambda instance — the INIT phase
s3 = boto3.client("s3")

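def handler(event, context):
    # reuse the client; return only JSON-serializable data
    response = s3.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]

bad pattern
import boto3

def handler(event, context):
    # a new boto3 client on every invocation: slow
    s3 = boto3.client("s3")
    response = s3.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]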

Memory and time limits #

Item | Limit
Memory | 128 MB ~ 10,240 MB (in 1 MB steps)
Time | 1 sec ~ 900 sec (15 min)
Temp disk (/tmp) | 512 MB ~ 10 GB
Environment variables | 4 KB
Payload (sync) | 6 MB
Payload (async) | 256 KB
zip (source) | 50 MB (compressed)
zip (uncompressed) | 250 MB
Container image | 10 GB

Memory is tied to vCPU: at 1,769 MB you get one vCPU’s worth. Because raising memory also raises CPU, the function often finishes faster, and since you pay for memory × time, a higher memory setting can end up cheaper (AWS Lambda Power Tuning finds the optimal value).
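
To see how more memory can cost less, compare memory × time for two settings. The durations below are hypothetical measurements for a CPU-bound function, and the price of $0.0000166667 per GB-second is an assumption that should be checked against current pricing for your region and architecture.

```python
PRICE_PER_GB_SEC = 0.0000166667  # assumed on-demand rate


def compute_cost(memory_mb: int, duration_ms: float) -> float:
    """Duration cost of one invocation in USD (request fee excluded)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SEC


# Hypothetical measurements: memory setting (MB) -> average duration (ms).
measurements = {256: 800, 1024: 150}
for memory_mb, duration_ms in measurements.items():
    print(f"{memory_mb} MB: ${compute_cost(memory_mb, duration_ms):.10f} per invocation")
```

Here 256 MB uses 0.2 GB-sec per invocation and 1,024 MB uses 0.15 GB-sec, so the larger setting is both faster and cheaper. If the function spends its time waiting on I/O instead, the duration barely drops and more memory only costs more.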

Logging — CloudWatch Logs #

All of Lambda’s stdout / stderr automatically goes into the /aws/lambda/<function-name> log group of CloudWatch Logs.

logging
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    logger.info("Hello %s", event.get("name"))
    return "ok"

In operations we recommend structured logs — output as JSON, and you can query per field in CloudWatch Logs Insights.

JSON logs
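import json
import logging

logger = logging.getLogger()
# the root logger defaults to WARNING, so INFO would be dropped without this
logger.setLevel(logging.INFO)

def handler(event, context):
    logger.info(json.dumps({
        "request_id": context.aws_request_id,
        "user_id": event.get("user_id"),
        "action": "process",
        "duration_ms": 42  # placeholder; log the measured duration
    }))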

Powertools’s Logger handles this cleanly in one call.
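
If adding a library isn't an option, the standard logging module can produce the same kind of output with a custom formatter. This is a hand-rolled sketch, not how Powertools is implemented; `JsonFormatter`, `make_logger`, and the `fields` key passed through `extra` are our own conventions.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {"level": record.levelname, "message": record.getMessage()}
        # Fields passed via extra={"fields": {...}} are merged in.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def make_logger(stream) -> logging.Logger:
    logger = logging.getLogger("json-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("processed", extra={"fields": {"user_id": "u-1", "duration_ms": 42}})
print(buffer.getvalue(), end="")
```

Inside a function you would pass `sys.stdout` instead of the in-memory buffer, so each line ends up as one queryable JSON record in CloudWatch Logs.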

Layers — code reuse #

When several functions use the same dependencies / utilities, separate them out with a Lambda Layer.

create a Layer
# python dependency
mkdir -p python
pip install requests -t python/
zip -r layer.zip python

aws lambda publish-layer-version \
  --layer-name my-utils \
  --zip-file fileb://layer.zip \
  --compatible-runtimes python3.13
attach to a function
aws lambda update-function-configuration \
  --function-name hello \
  --layers arn:aws:lambda:ap-northeast-2:123456789012:layer:my-utils:1

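The advantage is that the function zip gets smaller and shared dependencies are maintained in one place (each function still has to be pointed at the new Layer version). The drawback is that tracking gets hard as Layers multiply; a function can attach at most 5 Layers.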

Lambda cost #

Lambda cost
monthly cost = (invocation count × $0.0000002)
             + (execution time GB-sec × $0.0000166667)

Example: 1M invocations per month, 100 ms per invocation, 256 MB of memory.

  • Invocation cost: 1M × $0.0000002 = $0.20
  • Time cost: 1M × 0.1 sec × 0.25GB × $0.0000166667 = $0.42
  • Total: ~$0.62 / month
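
The same arithmetic as a small calculator, so other workloads can be plugged in. It uses the two prices from the formula above and ignores the free tier; `monthly_cost` is a hypothetical helper name.

```python
PRICE_PER_REQUEST = 0.0000002
PRICE_PER_GB_SEC = 0.0000166667


def monthly_cost(invocations: int, duration_ms: float, memory_mb: int) -> dict:
    """Break the monthly bill into request and duration parts (USD)."""
    request_cost = invocations * PRICE_PER_REQUEST
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    duration_cost = gb_seconds * PRICE_PER_GB_SEC
    return {
        "requests": request_cost,
        "duration": duration_cost,
        "total": request_cost + duration_cost,
    }


cost = monthly_cost(1_000_000, 100, 256)
print({name: round(value, 2) for name, value in cost.items()})
```

Rounded to cents, this reproduces the $0.20, $0.42, and $0.62 figures from the worked example.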

Very cheap. The free tier gives 1M invocations + 400,000 GB-sec per month free. A small workload is effectively 0.

Cases where cost grows are as follows.

  • Functions with long per-invocation time and large memory (e.g., 5 min × 3GB)
  • Functions with always-high concurrency — ECS / Fargate may be cheaper

Pitfalls you’ll often hit #

1) Slow first invocation due to cold start #

When a Lambda at the API entrance hits a cold start, that request can take 0.5 ~ 2 seconds, which users notice. Work through the five items under “Reducing cold starts” above, Provisioned Concurrency included.

2) Creating objects every time inside the handler #

bad example
def handler(event, context):
    db = create_db_connection()
    boto = boto3.client("s3")
    ...

This can waste on the order of 100 ms on every invocation. Pull the setup out to the global scope so warm instances reuse it.

3) RDS connection pool runaway #

100 concurrent Lambdas → 100 DB connections → the DB exceeds its connection limit. The alternatives are as follows.

  • RDS Proxy — a connection pool the Lambdas share (Chapter 11 RDS)
  • Limit the function’s concurrency with Reserved Concurrency
  • Use a serverless DB like DynamoDB

4) An async invocation’s failure vanishes silently #

Even if S3 → Lambda fails, the caller doesn’t know. Capture failures with a DLQ (SQS) or a Lambda Destination.

async failures to an SQS DLQ
aws lambda put-function-event-invoke-config \
  --function-name hello \
  --maximum-retry-attempts 2 \
  --destination-config '{"OnFailure":{"Destination":"arn:aws:sqs:...:dlq"}}'

5) Cut off by a timeout #

If you put heavy processing into a function without accounting for the time limit, it’s forcibly terminated the moment the configured timeout is reached, and the timeout can’t be set above 15 minutes. Split long processing with Chapter 21 Step Functions intro or move it to ECS / Fargate.

6) Payload exceeds 6MB #

6 MB is Lambda’s limit for the request and response of a synchronous invocation, so it applies to API Gateway → Lambda as well. Work around large files with the S3 presigned URL pattern: Lambda only issues the presigned URL, and the client uploads to S3 directly.
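
A sketch of the issuing side of that pattern. It relies on boto3's `generate_presigned_url` for the `put_object` operation; the helper name, the 300-second expiry, and the response shape are assumptions, and `FakeS3Client` exists only so the example runs without AWS access.

```python
import json


def build_upload_response(s3_client, bucket: str, key: str, expires_in: int = 300) -> dict:
    """Return an HTTP-style response carrying a presigned upload URL."""
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
    return {"statusCode": 200, "body": json.dumps({"upload_url": url, "key": key})}


class FakeS3Client:
    """Stand-in for boto3.client("s3") so the sketch runs without AWS."""

    def generate_presigned_url(self, ClientMethod, Params=None, ExpiresIn=3600):
        return f"https://example.invalid/{Params['Bucket']}/{Params['Key']}?expires={ExpiresIn}"


response = build_upload_response(FakeS3Client(), "my-bucket", "uploads/photo.jpg")
print(response["body"])
```

In a deployed function, pass a real client created once at module scope. The client then sends the file with an HTTP PUT to the returned URL, so the upload never passes through Lambda.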

7) Secrets in environment variables #

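If you put a DB password in a plaintext environment variable, anyone who can view the function’s configuration in the console or through the API can read it, and it leaks into logs as soon as the environment is printed. Move it to Chapter 20 Secrets Manager / Parameter Store.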

Exercises #

  1. Judge in one sentence whether the processing you want to build belongs to the “fits” or “doesn’t fit” cases in §“When Lambda fits,” and if it doesn’t fit, write which model of Chapter 15 ECS and Fargate you should go to.
  2. Among the five methods in §“Reducing cold starts,” split which can be applied without changing a single line of code and which require changing the code structure. Then connect where the good and bad examples of §“The global variable pattern” make a difference between the INIT / INVOKE phases of a cold start.
  3. Following the calculation in §“Lambda cost,” compute the monthly cost of a function with 5M invocations/month + 200ms per invocation + 512MB. If you run the same workload always-on as Fargate from Chapter 15 ECS and Fargate, judge in one sentence which is cheaper.

In short: Lambda is a serverless function that wakes only when an event arrives, costs 0 when traffic is 0, and fits event-driven, highly variable traffic with short processing. Its model has four parts (runtime, handler, event, and context), and invocation splits into sync, async, and stream. Because a single function instance processes only one invocation at a time, instances scale out as concurrent invocations grow. The most common pitfall is the cold start, mitigated by increasing memory, using global variables, and Provisioned Concurrency. The 15-minute cap and the 6 MB payload limit are worked around with Step Functions and S3 presigned URLs, and secrets move out of environment variables into Secrets Manager.

Next chapter #

A function on its own has no entrance for callers. The next chapter, Chapter 18 API Gateway + Lambda, covers the most common pattern for calling Lambda over an HTTP request. It covers the entrance of a serverless API: the difference between REST API and HTTP API, Lambda integration, routes / methods, authorization (IAM / Cognito / Lambda authorizer), and stages / deployment.
