AWS Step Functions vs EventBridge
- Both Step Functions and EventBridge are serverless services for connecting AWS services and applications, but they serve fundamentally different purposes.
- Step Functions orchestrates multi-step workflows with state management and error handling.
- EventBridge routes events between services based on content-based rules without maintaining state.
- They are often used together – EventBridge triggers Step Functions workflows based on events.
Step Functions vs EventBridge Comparison
| Feature | Step Functions | EventBridge |
|---|---|---|
| Pattern | Orchestration (centralized control) | Choreography (decoupled routing) |
| State Management | Yes – tracks execution state, input/output between steps | No – stateless event routing |
| Execution Model | Sequential, parallel, branching, looping | Fire-and-forget event delivery |
| Duration | Standard: up to 1 year; Express: up to 5 minutes | Near real-time delivery (no duration concept) |
| Error Handling | Built-in Retry and Catch fields on states, with fallback states | Retry policy and dead-letter queue on target delivery failure |
| Visibility | Visual workflow graph, step-by-step execution history | Rule match metrics, limited execution visibility |
| Targets/Integrations | 200+ AWS service integrations (direct SDK calls) | A fixed set of AWS service target types plus API destinations (up to 5 targets per rule) |
| Event Sources | Triggered by API call, EventBridge, API Gateway, Lambda | 90+ AWS services, SaaS partners, custom apps |
| Filtering | Choice state (conditions on input data) | Content-based filtering on event body (event patterns) |
| Parallelism | Parallel state, Distributed Map (millions of items) | Multiple targets per rule (fan-out) |
| Human Approval | Yes – Task tokens with callback pattern | No native support |
| Scheduling | Wait state (delay steps) | EventBridge Scheduler (cron/rate/one-time) |
| Replay | Redrive failed executions (launched late 2023) | Event Archive and Replay |
| Pricing | Standard: per state transition; Express: per request + duration | Per custom event published ($1/million); AWS service events on the default bus are free |
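The pricing row can be made concrete with a little arithmetic. The sketch below is only a rough estimate: the $1 per million events figure comes from the table, while the Standard Workflow rate of $0.025 per 1,000 state transitions is an assumed list price that should be checked against the current AWS pricing page for your region. The function names are hypothetical, and free tiers are ignored.

```python
# Illustrative rates only; verify against the current AWS pricing pages.
STANDARD_PRICE_PER_TRANSITION = 0.000025  # assumption: $0.025 per 1,000 state transitions
EVENTBRIDGE_PRICE_PER_MILLION = 1.00      # from the comparison table: $1 per million events


def standard_workflow_cost(executions: int, transitions_per_execution: int) -> float:
    """Estimated cost of Standard Workflows, ignoring any free tier."""
    transitions = executions * transitions_per_execution
    return round(transitions * STANDARD_PRICE_PER_TRANSITION, 2)


def eventbridge_cost(events: int) -> float:
    """Estimated cost of publishing custom events to an event bus."""
    return round(events / 1_000_000 * EVENTBRIDGE_PRICE_PER_MILLION, 2)


if __name__ == "__main__":
    # One million workflow runs with six state transitions each,
    # compared with one million custom events.
    print(standard_workflow_cost(1_000_000, 6))
    print(eventbridge_cost(1_000_000))
```

Under these assumptions, orchestration cost scales with the number of steps per run, while routing cost scales only with the number of events published.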
AWS Step Functions
- Serverless workflow orchestration – coordinates multiple AWS services into visual workflows.
- Standard Workflows – up to 1 year, exactly-once execution, full execution history, ideal for long-running processes.
- Express Workflows – up to 5 minutes, at-least-once execution for asynchronous runs (at-most-once for synchronous runs), built for high-volume event processing (event rates above 100,000 per second).
- States: Task, Choice, Parallel, Map, Wait, Pass, Succeed, Fail.
- Direct SDK integrations – call 200+ AWS services without Lambda (DynamoDB PutItem, SQS SendMessage, ECS RunTask, Bedrock InvokeModel).
- Distributed Map – process millions of items from S3 in parallel (up to 10,000 concurrent executions).
- Callback pattern – pause workflow, wait for external system/human approval via task token.
- Error handling – Retry with exponential backoff, Catch with fallback states, per-step timeout.
- Redrive (launched late 2023) – restart failed executions from the point of failure without re-running completed steps.
- Variables and JSONata (2024) – workflow-level variables and powerful data transformation expressions.
- Best for: Multi-step processes needing coordination, error handling, human approval, long-running workflows, batch processing.
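The direct integrations, error handling, and callback pattern listed above can be sketched as a single Amazon States Language definition, written here as a Python dictionary so it can be checked locally before deployment. This is a minimal sketch, not a production workflow: the table name, queue URL, account ID, state names, and the two helper functions are hypothetical, and the definition uses JSONPath-style parameters.

```python
import json

# Hypothetical workflow: save an order, then pause until an approver responds.
definition = {
    "Comment": "Sketch: SDK integration, Retry/Catch, and task-token callback",
    "StartAt": "SaveOrder",
    "States": {
        "SaveOrder": {
            "Type": "Task",
            # Direct service integration: no Lambda function in between.
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "Orders",  # hypothetical table
                "Item": {"orderId": {"S.$": "$.orderId"}},
            },
            "ResultPath": None,  # discard the result, keep the original input
            "TimeoutSeconds": 30,
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
            "Next": "WaitForApproval",
        },
        "WaitForApproval": {
            "Type": "Task",
            # Callback pattern: the execution pauses until the token is returned.
            "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
            "Parameters": {
                "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
                "MessageBody": {
                    "orderId.$": "$.orderId",
                    "taskToken.$": "$$.Task.Token",
                },
            },
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
        "HandleFailure": {"Type": "Fail", "Error": "OrderFailed"},
    },
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry (ignores MaxDelaySeconds and jitter)."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


def undefined_targets(machine: dict) -> set:
    """Names referenced by StartAt, Next, or Catch that are not defined states."""
    states = machine["States"]
    targets = {machine["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return targets - set(states)


if __name__ == "__main__":
    print(retry_delays(definition["States"]["SaveOrder"]["Retry"][0]))
    print(undefined_targets(definition))
    print(json.dumps(definition, indent=2))
```

The JSON printed at the end is what would be supplied as the state machine definition. The paused `WaitForApproval` step resumes when the external system calls the Step Functions `SendTaskSuccess` or `SendTaskFailure` API with the token it received.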
Amazon EventBridge
- Serverless event bus – routes events between decoupled services based on rules.
- The default event bus receives events from 90+ AWS services automatically, in most cases with no configuration on the producer side.
- Content-based filtering – event patterns match on any field in the event JSON body.
- Multiple targets per rule – fan-out a single event to up to 5 targets.
- EventBridge Scheduler – millions of one-time or recurring schedules (the recommended successor to the scheduled rules inherited from CloudWatch Events).
- EventBridge Pipes – point-to-point with filtering, enrichment, and transformation between source and target.
- Event Archive and Replay – store events for a configurable retention period (indefinite by default) for reprocessing or debugging.
- Schema Registry – auto-discover event schemas for code generation.
- Global endpoints – automatic failover to secondary region.
- SaaS integrations – receive events from Zendesk, Datadog, Shopify, Auth0, etc.
- Best for: Event-driven architectures, reacting to AWS service changes, decoupled microservices, SaaS integration, scheduling.
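Content-based filtering is easier to picture with a concrete event pattern. The sketch below routes EC2 instance state-change events to different handlers. The `matches` function is a deliberately simplified local stand-in for EventBridge's matching, covering only exact-value lists and nested objects; it is not how the service is implemented and omits operators such as prefix or numeric matching. The handler names and instance ID are hypothetical.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern field must exist in the event, and the
    event's value must be one of the values listed in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def state_change_pattern(state: str) -> dict:
    """Event pattern for one EC2 instance state."""
    return {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": [state]},
    }


# One rule per state, each pointing at a different (hypothetical) function.
RULES = {
    "on-running": state_change_pattern("running"),
    "on-stopped": state_change_pattern("stopped"),
    "on-terminated": state_change_pattern("terminated"),
}


def route(event: dict) -> list:
    """Names of all rules whose pattern matches the event."""
    return [name for name, pattern in RULES.items() if matches(pattern, event)]


if __name__ == "__main__":
    sample = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
    }
    print(route(sample))
```

Fields that a pattern does not mention, such as `instance-id` here, are ignored, which is why producers can add fields to events without breaking existing rules.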
When to Choose Which
- Choose Step Functions when:
  - You need to coordinate multiple steps in a specific order
  - Workflow requires error handling with retries and fallbacks
  - You need visibility into which step succeeded/failed
  - Process requires human approval or external callbacks
  - Long-running processes (minutes to months)
  - Batch processing of millions of items (Distributed Map)
- Choose EventBridge when:
  - You need to react to events from AWS services or SaaS apps
  - Services should be decoupled (producers don’t know about consumers)
  - Routing based on event content to different targets
  - You need scheduling (cron jobs, one-time future events)
  - Fan-out: one event triggers multiple independent actions
  - Cross-account or cross-region event routing
- Use Both Together: EventBridge detects an event (e.g., S3 upload) → triggers Step Functions workflow → orchestrates multi-step processing (validate → transform → load → notify).
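The combined pattern above can be sketched as the two request payloads involved: a rule that matches S3 "Object Created" events and a target that starts a state machine. This is a sketch under stated assumptions: the rule name, target ID, bucket, and ARNs are hypothetical placeholders, the bucket is assumed to have EventBridge notifications enabled, and the code only builds the payloads without calling AWS.

```python
import json


def build_rule_and_target(bucket: str, state_machine_arn: str, role_arn: str):
    """Build keyword arguments shaped for the EventBridge PutRule and
    PutTargets APIs. Nothing is sent to AWS here."""
    pattern = {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }
    rule = {
        "Name": "start-etl-on-upload",  # hypothetical rule name
        "EventPattern": json.dumps(pattern),
        "State": "ENABLED",
    }
    targets = {
        "Rule": rule["Name"],
        "Targets": [{
            "Id": "etl-state-machine",  # hypothetical target ID
            "Arn": state_machine_arn,
            # Role that EventBridge assumes to start the execution.
            "RoleArn": role_arn,
        }],
    }
    return rule, targets


if __name__ == "__main__":
    rule, targets = build_rule_and_target(
        bucket="example-upload-bucket",
        state_machine_arn="arn:aws:states:us-east-1:123456789012:stateMachine:etl",
        role_arn="arn:aws:iam::123456789012:role/eventbridge-start-etl",
    )
    print(json.dumps(rule, indent=2))
    print(json.dumps(targets, indent=2))
```

With boto3, these dictionaries could be passed as `put_rule(**rule)` and `put_targets(**targets)` on an `events` client. The matched event becomes the execution input, so the validate, transform, load, and notify steps can read the bucket and object key from it.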
AWS Certification Exam Practice Questions
- An order processing system requires validating payment details, checking inventory, reserving items, charging the card, and sending confirmation – each step depends on the previous one succeeding. If charging the card fails, the reserved items must be released. Which service handles this?
  - EventBridge with multiple rules
  - Step Functions with error handling (Catch/compensating actions)
  - SQS with multiple queues
  - SNS with filter policies
- A company wants to automatically trigger different Lambda functions when EC2 instances change state (running, stopped, terminated) – each state routes to a different function. Which service is most appropriate?
  - Step Functions with Choice state
  - CloudWatch Alarms
  - EventBridge with content-based rules
  - SNS with message filtering
- A data pipeline processes millions of S3 objects in parallel, with each object needing 3 transformation steps. The pipeline must track progress and retry individual failures. Which approach is recommended?
  - EventBridge Pipes with SQS
  - Lambda triggered by S3 events
  - Step Functions Distributed Map
  - EventBridge with Lambda targets
- A workflow requires pausing execution until a human reviews and approves a document via an external web application (may take hours or days). Which feature supports this?
  - EventBridge wait pattern
  - Step Functions callback pattern with task token
  - SQS visibility timeout
  - Lambda with DynamoDB polling
- A company needs to schedule 2 million one-time reminder notifications, each to be sent at a different future time. Which service handles this at scale?
  - Step Functions Wait state
  - CloudWatch Events cron
  - EventBridge Scheduler
  - SQS delay queues