AWS Certified Developer - Associate (DVA-C02) #5 Domain 1-4 Development with AWS Services — Messaging and Events
Now that we’ve covered the data layer in #4 DynamoDB, it’s time for asynchronous messaging that loosely connects components. A serverless architecture hits scaling limits with synchronous invocation alone. The exam repeatedly presents scenarios asking “which of SQS, SNS, EventBridge, and Step Functions should you choose?” Let’s first pin down each of the four services’ roles in one sentence.
| Service | One-line definition |
|---|---|
| SQS | Pile work in a queue and have one consumer pull and process it (point-to-point) |
| SNS | Publish a message to many subscribers at once (pub/sub, push) |
| EventBridge | An event bus that routes events by rules (SaaS/schedule integration) |
| Step Functions | Orchestrate multiple steps as a state machine |
SQS — queues #
SQS is a fully managed queue where producers put messages into a queue and consumers poll to pull them out. It decouples components and smooths throughput (buffering).
Standard queue vs FIFO queue #
| Aspect | Standard | FIFO |
|---|---|---|
| Ordering | Not guaranteed (best-effort) | Strict ordering |
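| Delivery | At-least-once (duplicates possible) | Exactly-once processing (duplicates removed within a 5-minute deduplication window) |
| Throughput | Practically unlimited | Limited: 300 API calls per second per action by default, about 3,000 messages per second with batches of 10 |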
If “ordering matters / you must prevent duplicates,” it’s FIFO; if throughput is the top priority, it’s standard. Standard queues allow duplicates, so consumers must be idempotent.
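To make "consumers must be idempotent" concrete, here is a minimal sketch of a consumer that ignores a redelivered message. It is a plain in-memory illustration, not SQS client code: the `IdempotentConsumer` class, the `amount` field, and the message IDs are hypothetical, and a real consumer would keep the processed IDs in a durable store.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by its message ID."""

    def __init__(self):
        # Assumption: in-memory set for illustration; use a durable store in practice.
        self.processed_ids = set()
        self.total = 0

    def handle(self, message):
        message_id = message["MessageId"]
        if message_id in self.processed_ids:
            return False  # duplicate delivery: skip the side effect
        self.total += message["amount"]
        self.processed_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"MessageId": "m-1", "amount": 100},
    {"MessageId": "m-2", "amount": 50},
    {"MessageId": "m-1", "amount": 100},  # the same message delivered twice
]
results = [consumer.handle(m) for m in deliveries]
print(results, consumer.total)  # [True, True, False] 150
```

The duplicate is acknowledged but has no effect, which is the property a standard queue's at-least-once delivery requires from its consumers.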
Visibility timeout #
When a consumer pulls a message, that message is invisible to other consumers for the duration of the visibility timeout. If you process and delete it within this time, you’re done; if you fail to delete it, the message reappears in the queue and is reprocessed.
Exam trap: if processing takes long but the visibility timeout is short, a message still being processed becomes visible again and is processed twice. The answer is to increase the visibility timeout or extend it with ChangeMessageVisibility.
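The double-processing trap can be reproduced with a toy model. The sketch below is a simplified in-memory simulation, not the real service or SDK: `SimulatedQueue` and its methods are hypothetical names, and time is passed in explicitly as `now` so the behavior is deterministic.

```python
class SimulatedQueue:
    """Toy in-memory model of the SQS visibility timeout (not the real service)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.invisible_until = {}  # message body -> time it becomes visible again

    def send(self, body):
        self.invisible_until[body] = 0

    def receive(self, now):
        for body, visible_at in self.invisible_until.items():
            if now >= visible_at:
                # Hide the message from other consumers for the timeout period.
                self.invisible_until[body] = now + self.visibility_timeout
                return body
        return None

    def change_visibility(self, body, now, timeout):
        # Same idea as ChangeMessageVisibility: the new timeout counts from now.
        self.invisible_until[body] = now + timeout

    def delete(self, body):
        self.invisible_until.pop(body, None)


queue = SimulatedQueue(visibility_timeout=30)
queue.send("order-1")

first = queue.receive(now=0)        # consumer A takes the message
hidden = queue.receive(now=10)      # consumer B sees nothing yet
duplicate = queue.receive(now=40)   # A is still working, timeout expired: B gets it too

queue.change_visibility("order-1", now=40, timeout=120)  # extend while processing
after_extend = queue.receive(now=100)  # hidden again until t=160
queue.delete("order-1")                # finished: remove the message for good
print(first, hidden, duplicate, after_extend)  # order-1 None order-1 None
```

The second delivery at `now=40` is the duplicate the exam scenario describes; extending the timeout before it expires would have prevented it.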
Long polling and DLQ #
- Long polling: ReceiveMessage waits up to 20 seconds for a message to arrive instead of returning an empty response right away. It’s the recommended setting because it cuts the cost of empty receives.
- DLQ (dead-letter queue): moves a message that has been received maxReceiveCount times without being processed successfully to a separate queue, isolating it. It prevents a poison pill from clogging the queue.
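Both settings are queue attributes. As a rough sketch, the helper below builds an attribute map using the SQS attribute names `ReceiveMessageWaitTimeSeconds`, `VisibilityTimeout`, and `RedrivePolicy`. The ARN, the default values, and the `build_queue_attributes` function itself are assumptions chosen for illustration.

```python
import json

# Hypothetical ARN, for illustration only.
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:orders-dlq"


def build_queue_attributes(dlq_arn, max_receive_count=5, wait_seconds=20,
                           visibility_timeout=60):
    """Return queue attributes as a string-valued map, the shape SQS expects."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("long polling wait time must be between 0 and 20 seconds")
    return {
        # Long polling: receives wait up to this many seconds for a message.
        "ReceiveMessageWaitTimeSeconds": str(wait_seconds),
        "VisibilityTimeout": str(visibility_timeout),
        # DLQ: after max_receive_count receives, the message moves to dlq_arn.
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        }),
    }


attributes = build_queue_attributes(DLQ_ARN)
print(json.dumps(attributes, indent=2))
```

With boto3, a map like this would typically be passed as the `Attributes` argument when creating the source queue; the redrive policy goes on the source queue, not on the DLQ.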
SNS — pub/sub #
SNS pushes one message to many subscribers at once. Subscribers include Lambda, SQS, HTTP/S, email, SMS, and more. If SQS is “one consumer pulls,” SNS is “push to all subscribers.”
Fan-out pattern (SNS + SQS) #
A regular exam pattern. When you subscribe multiple SQS queues to an SNS topic, a single published message enters multiple queues at once, and each queue’s consumer processes independently.
- Each consumer processes at its own pace (buffering).
- One consumer’s failure doesn’t affect the others.
- It’s easy to add new subscribers (queues) later.
The answer to “one event must be processed independently by multiple systems, and each needs durability and retries” is the SNS + SQS fan-out. With SNS alone, it’s hard to buffer and reprocess a message when a subscriber fails.
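The fan-out behavior can be sketched with an in-memory stand-in. This is a toy model, not the SNS or SQS API: the `Topic` class and the queue names are hypothetical, and each `deque` plays the role of one subscribed queue.

```python
from collections import deque


class Topic:
    """Toy stand-in for an SNS topic with SQS queues subscribed to it."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        # One publish puts a copy of the message into every subscribed queue.
        for queue in self.queues.values():
            queue.append(message)


topic = Topic()
payment = topic.subscribe("payment")
notification = topic.subscribe("notification")
analytics = topic.subscribe("analytics")

topic.publish({"event": "OrderPlaced", "order_id": 42})

payment.popleft()  # the payment consumer processes its copy right away
backlog = {name: len(queue) for name, queue in topic.queues.items()}
print(backlog)  # {'payment': 0, 'notification': 1, 'analytics': 1}
```

The slower consumers still hold their own copy, which is the buffering and isolation that SNS alone does not provide.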
EventBridge — event bus #
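EventBridge is an event bus that filters events by rules and routes them to targets. It looks similar to SNS but works at a different level: SNS delivers whatever is published to a topic, while EventBridge inspects each event’s content to decide where it goes.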
| Aspect | SNS | EventBridge |
|---|---|---|
| Model | Topic subscription, simple, high throughput | Event pattern matching, rich routing |
| Filtering | Message attribute filters | Content-based pattern matching |
| Sources | Direct publish | AWS service events, SaaS partners, custom |
| Schedule | None | Built-in cron/rate schedule |
| Schema | None | Schema registry |
The key distinction: simple high-speed fan-out is SNS; routing events from diverse sources by content, or needing schedules, is EventBridge. The answer to “run a Lambda at a set time every day” is an EventBridge schedule rule (formerly CloudWatch Events).
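Content-based matching is easier to picture with a small example. The `matches` function below is a deliberately simplified sketch of how an event pattern is evaluated: it only handles exact-value lists and nested objects, whereas real EventBridge patterns also support operators such as prefix and numeric matching. The instance IDs are made up. The cron string shows one way to write "midnight UTC every day" in EventBridge's six-field format.

```python
def matches(pattern, event):
    """Simplified matcher: every pattern key must match; leaves list allowed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Pattern: EC2 instances that have stopped or terminated.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

stopped_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
running_event = {**stopped_event, "detail": {"instance-id": "i-0123456789abcdef0",
                                             "state": "running"}}

# Schedule expression: minute hour day-of-month month day-of-week year.
DAILY_MIDNIGHT_UTC = "cron(0 0 * * ? *)"

print(matches(pattern, stopped_event), matches(pattern, running_event))  # True False
```

Fields the pattern does not mention, such as `instance-id`, are ignored, so a rule only needs to name the fields it cares about.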
Step Functions — orchestration #
Step Functions coordinates a multi-step workflow as a state machine. Branching, parallelism, retries, waits, and error handling are expressed declaratively in a definition written in ASL (Amazon States Language) rather than in application code.
| Type | When it fits |
|---|---|
| Standard | Long-running (up to 1 year), exactly once, audit history needed |
| Express | Short (up to 5 minutes), high-volume, high-frequency event processing |
When you need to “invoke several Lambdas in order and conditionally, with retries/compensation on failure,” the correct approach is not to invoke them directly inside a Lambda but to orchestrate them with Step Functions. The state transitions are visible at a glance, and you can handle each step’s retries and error handling declaratively.
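As an illustration of the declarative style, the sketch below builds a small ASL definition as a Python dict and serializes it to JSON. The state names, the Lambda ARNs, and the retry numbers are all hypothetical; the field names (`Retry`, `Catch`, `Choices`, and so on) are standard ASL. The `referenced_states` helper is only a sanity check that every transition points at a defined state.

```python
import json

# Hypothetical Lambda ARN prefix, for illustration only.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"

definition = {
    "Comment": "Sketch: validate, branch on approval, charge with retries, compensate",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "validate-order",
            "Next": "IsApproved",
        },
        "IsApproved": {
            "Type": "Choice",  # conditional branch on the state input
            "Choices": [
                {"Variable": "$.approved", "BooleanEquals": True, "Next": "ChargePayment"}
            ],
            "Default": "Rejected",
        },
        "ChargePayment": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "charge-payment",
            # Declarative retry with exponential backoff: 2s, 4s, 8s.
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 2,
                       "MaxAttempts": 3, "BackoffRate": 2.0}],
            # Compensation step if retries are exhausted or another error occurs.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "RefundOrder"}],
            "End": True,
        },
        "RefundOrder": {"Type": "Task", "Resource": ARN_PREFIX + "refund-order",
                        "End": True},
        "Rejected": {"Type": "Fail", "Error": "OrderRejected",
                     "Cause": "Order was not approved"},
    },
}


def referenced_states(defn):
    """Collect every state name that StartAt, Next, Default, Choices, or Catch points to."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return targets


missing = referenced_states(definition) - set(definition["States"])
asl_json = json.dumps(definition, indent=2)
print(missing)  # set()
```

None of the retry or compensation logic lives inside the Lambda functions, which is the point of moving orchestration out of code and into the state machine.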
Service selection quick table #
| Requirement keyword | Answer |
|---|---|
| Decouple work and buffer, one consumer | SQS |
| Ordering, deduplication | SQS FIFO |
| Push one to many subscribers | SNS |
| One event → multiple systems each process durably | SNS + SQS fan-out |
| Content-based routing, SaaS events | EventBridge |
| Run at a set time/interval | EventBridge schedule |
| Coordinate a multi-step workflow, retries, branching | Step Functions |
Exam question patterns #
- “Smooth out work while protecting the backend during a traffic spike.” → SQS (buffer).
- “Process orders in the order received, without duplicates.” → SQS FIFO.
- “Have payment, notification, and analytics systems each process one event.” → SNS + SQS fan-out.
- “React to SaaS events like GitHub/Datadog.” → EventBridge.
- “Run a cleanup job at midnight every day.” → EventBridge schedule rule.
- “A 5-step approval workflow with retries and branching.” → Step Functions.
- “A message was consumed again while being processed.” → Extend the visibility timeout.
Wrap-up #
What this post locked in:
- SQS queues (decouple/buffer), standard (at-least-once) vs FIFO (ordering, deduplication)
- Visibility timeout, long polling, and DLQs are core to SQS operations
- SNS pub/sub, and durable multi-consumer via the SNS + SQS fan-out
- EventBridge — content-based routing, SaaS, schedules
- Step Functions — multi-step workflow orchestration (Standard vs Express)
Next — Domain 1-5 SDK Development Patterns #
There are common patterns that recur when you call services in code. In #6 SDK Development Patterns, I’ll cover pagination, exponential backoff and jitter for handling throttling, idempotency implementation, S3 multipart upload and presigned URLs, and the SDK credential provider chain.