AWS Step Functions

Core concept: Step Functions orchestrates multi-step workflows as state machines, coordinating Lambda, SQS, DynamoDB, ECS, and 200+ AWS services.


Standard vs Express Workflows

Feature           | Standard                        | Express
------------------|---------------------------------|-----------------------------
Max duration      | 1 year                          | 5 minutes
Execution model   | Exactly-once                    | At-least-once
Execution history | Full history in console         | CloudWatch Logs only
Pricing           | Per state transition            | Per execution + duration
Use case          | Long-running business processes | High-volume, short workflows

State Types

State    | Purpose
---------|--------------------------------------------------
Task     | Do work (invoke Lambda, call API, etc.)
Choice   | Branch based on conditions
Wait     | Pause for a duration or until a timestamp
Parallel | Execute branches simultaneously
Map      | Iterate over an array
Pass     | Pass input to output (for testing/transformation)
Succeed  | End the workflow successfully
Fail     | End the workflow with an error

State Machine Definition (ASL)

{
  "Comment": "Order Processing Workflow",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:ValidateOrder",
      "Next": "ProcessPayment",
      "Catch": [{
        "ErrorEquals": ["ValidationError"],
        "Next": "SendFailureNotification"
      }]
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:ProcessPayment",
      "Retry": [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 2,
        "MaxAttempts": 3,
        "BackoffRate": 2.0
      }],
      "Next": "IsPaymentApproved"
    },
    "IsPaymentApproved": {
      "Type": "Choice",
      "Choices": [{
        "Variable": "$.paymentStatus",
        "StringEquals": "APPROVED",
        "Next": "FulfillOrder"
      }],
      "Default": "SendPaymentFailed"
    },
    "FulfillOrder": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "UpdateInventory",
          "States": {
            "UpdateInventory": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:...:UpdateInventory",
              "End": true
            }
          }
        },
        {
          "StartAt": "SendConfirmationEmail",
          "States": {
            "SendConfirmationEmail": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:...:SendEmail",
              "End": true
            }
          }
        }
      ],
      "End": true
    },
    "SendFailureNotification": {
      "Type": "Fail",
      "Comment": "Placeholder (assumption): replace with a Task that sends the notification",
      "Error": "ValidationError",
      "Cause": "Order validation failed"
    },
    "SendPaymentFailed": {
      "Type": "Fail",
      "Comment": "Placeholder (assumption): replace with a Task that reports the failed payment",
      "Error": "PaymentFailed",
      "Cause": "Payment was not approved"
    }
  }
}
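Every Next, Default, and Catch target in a definition has to name a state that exists in the same States map. The following is a minimal Python sketch of that check, not an official validator: the helper name undefined_targets is hypothetical, and it only looks at the fields used in this section (Next, Default, Choices, Catch, and Parallel Branches).

```python
import json


def undefined_targets(definition: dict) -> set:
    """Return transition targets that are not defined in the same States map."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        # Direct transitions on the state itself.
        for field in ("Next", "Default"):
            if field in state:
                targets.add(state[field])
        # Transitions nested in Choice rules and Catch handlers.
        for rule in state.get("Choices", []) + state.get("Catch", []):
            if "Next" in rule:
                targets.add(rule["Next"])
    missing = targets - set(states)
    # Each Parallel branch is its own small state machine; check it separately.
    for state in states.values():
        for branch in state.get("Branches", []):
            missing |= undefined_targets(branch)
    return missing


definition = json.loads("""
{
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:ValidateOrder",
      "Catch": [{"ErrorEquals": ["ValidationError"], "Next": "SendFailureNotification"}],
      "End": true
    }
  }
}
""")
print(undefined_targets(definition))  # {'SendFailureNotification'}
```

Running a check like this before deploying catches a Catch or Default that points at a state nobody defined.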

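Error Handling

Retry

"Retry": [{
  "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
  "IntervalSeconds": 1,
  "MaxAttempts": 3,
  "BackoffRate": 2.0
}]

With these values, the three retries wait 1, 2, and 4 seconds. (ASL is plain JSON, so comments cannot appear inside the definition itself.)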
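The wait before each retry is IntervalSeconds multiplied by BackoffRate once per previous retry. As a rough illustration, the schedule can be computed in Python. The function name retry_delays is hypothetical, and the sketch ignores optional fields such as MaxDelaySeconds.

```python
def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> list:
    """Delay before each retry: interval * backoff_rate ** (retries already made)."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


print(retry_delays(1, 3, 2.0))  # [1.0, 2.0, 4.0]
```

With IntervalSeconds set to 2, as in the ProcessPayment state, the same formula gives 2, 4, and 8 seconds.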

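Catch

"Catch": [{
  "ErrorEquals": ["PaymentDeclined"],
  "ResultPath": "$.error",
  "Next": "HandlePaymentError"
}]

Setting ResultPath to $.error adds the error details to the original input under the error key, so the next state sees both. Without it, the error output replaces the input.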
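To make the effect of ResultPath concrete, here is a small Python sketch of what HandlePaymentError would receive. The helper apply_result_path is hypothetical and handles only simple dotted paths such as $.error. The sample order ID and cause text are made up for illustration.

```python
import copy


def apply_result_path(state_input: dict, result, result_path: str):
    """Place `result` into a copy of `state_input` at a simple path like "$.error"."""
    if result_path == "$":
        return result  # "$" means the result replaces the whole input
    if not result_path.startswith("$."):
        raise ValueError("only simple '$.a.b' paths are handled in this sketch")
    keys = result_path[2:].split(".")
    output = copy.deepcopy(state_input)
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = result
    return output


# A caught error is delivered as an object with Error and Cause fields.
caught = {"Error": "PaymentDeclined", "Cause": "Card expired"}
print(apply_result_path({"orderId": "A-100"}, caught, "$.error"))
# {'orderId': 'A-100', 'error': {'Error': 'PaymentDeclined', 'Cause': 'Card expired'}}
```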

Wait for Callback Pattern

For human approval or external system responses:

"WaitForHumanApproval": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
  "Parameters": {
    "FunctionName": "SendApprovalEmail",
    "Payload": {
      "taskToken.$": "$$.Task.Token",
      "orderId.$": "$.orderId"
    }
  },
  "TimeoutSeconds": 86400,
  "Next": "ProcessApproval"
}

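The Lambda function forwards the task token to the approver. When the approver decides, their system calls SendTaskSuccess or SendTaskFailure with that token to resume the workflow. If no call arrives within TimeoutSeconds (24 hours here), the task fails with a timeout error.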
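The two halves of the callback can be sketched in Python as follows. This is an illustration under assumptions, not the original author's implementation: handler, resolve_approval, RecordingClient, and the ApprovalRejected error name are hypothetical. The client methods send_task_success and send_task_failure mirror the boto3 Step Functions client. A recording stand-in is used so the sketch runs without AWS credentials; in a real deployment you would pass boto3.client("stepfunctions") instead.

```python
import json


def handler(event, context=None):
    """Lambda entry point; `event` is the Payload built by the Task state."""
    token = event["taskToken"]
    order_id = event["orderId"]
    # A real function would deliver the token to the approver here
    # (for example, inside an approval link). Delivery is out of scope.
    return {"orderId": order_id, "taskToken": token}


def resolve_approval(sfn_client, task_token, order_id, approved):
    """Resume the paused execution once the approver has decided."""
    if approved:
        sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps({"orderId": order_id, "approved": True}),
        )
    else:
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="ApprovalRejected",  # hypothetical error name
            cause=f"Order {order_id} was rejected",
        )


class RecordingClient:
    """Stand-in for boto3.client("stepfunctions") so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def send_task_success(self, **kwargs):
        self.calls.append(("success", kwargs))

    def send_task_failure(self, **kwargs):
        self.calls.append(("failure", kwargs))


client = RecordingClient()
pending = handler({"taskToken": "example-token", "orderId": "A-100"})
resolve_approval(client, pending["taskToken"], pending["orderId"], approved=True)
print(client.calls[0][0])  # success
```

The output passed to send_task_success must be a JSON string; it becomes the result of the WaitForHumanApproval state.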


Map State (Parallel Processing)

"ProcessAllOrders": {
  "Type": "Map",
  "ItemsPath": "$.orders",
  "MaxConcurrency": 10,
  "Iterator": {
    "StartAt": "ProcessSingleOrder",
    "States": {
      "ProcessSingleOrder": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:...:ProcessOrder",
        "End": true
      }
    }
  },
  "Next": "SendSummary"
}
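Conceptually, a Map state behaves like a bounded worker pool: MaxConcurrency caps how many iterations run at once, and the output array follows the order of the input array. The Python analogy below is only a local model of that behavior, not how Step Functions is implemented. The names run_map and process_order are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_map(items, process, max_concurrency):
    """Apply `process` to every item with at most `max_concurrency` running at once."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map yields results in input order, regardless of completion order.
        return list(pool.map(process, items))


def process_order(order):
    """Stand-in for the ProcessSingleOrder task."""
    return {"orderId": order["orderId"], "status": "PROCESSED"}


orders = [{"orderId": n} for n in range(1, 4)]
print(run_map(orders, process_order, max_concurrency=10))
```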

🧪 Practice Questions

Q1. A workflow needs to process each item in a list in parallel, up to 5 items at a time. Which state type achieves this?

A) Parallel state
B) Choice state with conditions
C) Map state with MaxConcurrency: 5
D) Multiple Task states

✅ Answer & Explanation

C. The Map state iterates over an array and applies the same workflow to each item; MaxConcurrency caps how many items are processed at once. A Parallel state runs a fixed set of different branches simultaneously, not the same branch for each item.


Q2. A payment workflow needs to pause and wait for a manual approval that may come hours later via an API call. Which pattern enables this?

A) Wait state with a fixed duration
B) Poll DynamoDB every minute for approval
C) Task state with .waitForTaskToken (Callback pattern)
D) Choice state polling an SQS queue

✅ Answer & Explanation

C. The Callback pattern (.waitForTaskToken) pauses the workflow until an external system calls SendTaskSuccess or SendTaskFailure with the token, or until the task times out. No polling and no fixed wait time are needed.


🔗 Resources