Engaging your users with AWS Step Functions
Imagine a new user signs up for your service. You send an automated welcome message to your new user explaining how the service works. But what if your user struggles with the first steps? You want to send a second email with additional information. To abstract this a little bit, the following steps are needed:
- Send a welcome message to the new user.
- Wait some time.
- Check if the user completed the initial steps.
  a. If yes, done.
  b. If no, continue.
- Send a message with additional information to the new user.
- Wait some time.
- Check if the user completed the initial steps.
  a. If yes, done.
  b. If no, continue.
- Send a message to the new user offering a Chime call.
This is nothing more than a state machine. It has a start (new user signed up) and an end (the last message was sent) and a few state transitions in between. With AWS Step Functions, you can implement a state machine. To do so, you have to translate the steps into the right format and implement the business logic. I will use AWS Lambda to implement the business logic in this post. Let’s get started.
Anatomy of a state machine in AWS Step Functions
A state machine in AWS Step Functions can take input data in JSON and consists of states:
- There is one start state that gets the input when starting the state machine.
- Each state can either be an end state or will point to the next state.
- There are one or many end states.
- A state is of a specific type.
- By default, a state passes its input through unchanged as its output. Some state types change this.
In this example, four different state types are used, but more exist. The four state types used are:
Type | Description |
---|---|
Task | Calls a Lambda function. The event of the Lambda function is the input of the state. By default, the output of the Lambda function is the output of the state. If the Lambda function fails, it can be retried. |
Wait | Waits for a specific amount of time in seconds. You are not billed for the waiting time. |
Choice | So far, a state has only one next state. But sometimes you need to make a choice (e.g., if the user completed initial steps, then ..., else ...). Depending on a precondition, you can have several next states. |
Succeed | Indicates a successful end of a state machine. |
Now, you have to map the engaging steps to states.
Example state machine in AWS Step Functions
The start state is SendMessage1.
Id | Type | Description | Next |
---|---|---|---|
SendMessage1 | Task | Send a welcome message to the new user. | Wait1 |
Wait1 | Wait | Wait some time. | FetchActivityCount1 |
FetchActivityCount1 | Task | Fetch number of activities the new user performed. | CheckActivityCount1 |
CheckActivityCount1 | Choice | Did the user complete the initial steps? | If yes, then Done, else SendMessage2 |
SendMessage2 | Task | Send a message with additional information to the new user. | Wait2 |
Wait2 | Wait | Wait some time. | FetchActivityCount2 |
FetchActivityCount2 | Task | Fetch number of activities the new user performed. | CheckActivityCount2 |
CheckActivityCount2 | Choice | Did the user complete the initial steps? | If yes, then Done, else SendMessage3 |
SendMessage3 | Task | Send a message to the new user offering a Chime call. | Done |
Done | Succeed | Done. | - |
Now, the state machine is defined. Are you surprised by the states FetchActivityCount1 and CheckActivityCount1? The step Check if the user completed the initial steps was translated into two states:
- Task FetchActivityCount1: Fetch number of activities the new user performed.
- Choice CheckActivityCount1: Did the user complete the initial steps?
The reason for this is that a state can either do something (like getting the number of activities performed by the user from the database) or make a choice. You cannot do both in a single state. Also, the Lambda function cannot make that choice for you. Only the state machine can make a choice based on input data.
Now, the business logic (states of type Task) needs to be implemented.
Implementing tasks
A task can either call a Lambda function or an activity. If your business logic cannot be implemented with Lambda, you can fall back to activities. I will not cover activities in this example.
Send welcome message
I provide a dummy implementation here in Node.js that fails 30% of the time to demonstrate how retries work.
Fetch number of activities
I provide a dummy implementation here in Node.js that fails 30% of the time and reports that the user did not complete any activities 50% of the time.
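The original listing is missing here as well. A sketch under stated assumptions: the output field activityCount and the function name fetchActivityCount are hypothetical, and no database is queried. The function adds the count to the input so that a later Choice state can read it.

```javascript
'use strict';

// Hypothetical dummy logic: the activity count is random instead of coming from a database.
function fetchActivityCount(event, random = Math.random) {
  if (random() < 0.3) {
    // Fails in about 30% of the invocations.
    throw new Error('fetching activity count failed');
  }
  // About half of the successful calls report no activity, the other half report one.
  const activityCount = random() < 0.5 ? 0 : 1;
  // Keep the original input and add the count for the following Choice state.
  return Object.assign({}, event, { activityCount });
}

// Lambda entry point for the Node.js runtime.
exports.handler = async (event) => fetchActivityCount(event);
```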
So far, the state machine is not defined in a machine-readable format. You will change this in the next section.
Translate the state machine to JSON
State machines are defined in a JSON document like this:
{ |
The StartAt property defines the first state in the state machine. Let’s see how states are defined.
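To make the overall shape concrete, here is a minimal skeleton, built as a JavaScript object and serialized to JSON. It is a sketch, not the original listing: the Comment text is made up, the Resource is the placeholder ARN format, and only two states are included to keep it short.

```javascript
'use strict';

// Skeleton of a state machine definition: StartAt names an entry in States.
const skeleton = {
  Comment: 'Engage new users', // optional description, wording is an assumption
  StartAt: 'SendMessage1',
  States: {
    SendMessage1: {
      Type: 'Task',
      Resource: 'arn:aws:lambda:$region:$account-id:function:$function-name', // placeholder
      Next: 'Done'
    },
    Done: { Type: 'Succeed' }
  }
};

// The JSON document is what Step Functions consumes.
console.log(JSON.stringify(skeleton, null, 2));
```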
The first state is SendMessage1 of type Task:
"SendMessage1": { |
- The Resource property contains the ARN of the Lambda function (e.g., arn:aws:lambda:$region:$account-id:function:$function-name).
- The Retry property defines what happens if the Lambda function returns an error:
  - The first retry is performed after IntervalSeconds.
  - Each following retry waits BackoffRate times longer than the previous one, so retry number n is performed after IntervalSeconds * BackoffRate^(n-1) seconds.
  - At most MaxAttempts retries are performed.
- The Next property points to the next state.
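The following sketch shows what such a Task state could look like and how the retry waits add up. The interval of 2 seconds matches the execution log discussed later in this post. The values for BackoffRate and MaxAttempts, the States.ALL error matcher, and the helper retryDelays are assumptions for illustration. The helper follows my reading of the Amazon States Language, where the wait is multiplied by BackoffRate after every retry.

```javascript
'use strict';

const sendMessage1 = {
  Type: 'Task',
  Resource: 'arn:aws:lambda:$region:$account-id:function:$function-name', // placeholder ARN
  Retry: [{
    ErrorEquals: ['States.ALL'], // retry on any error
    IntervalSeconds: 2,          // wait before the first retry
    BackoffRate: 2,              // assumed value: each wait is twice the previous one
    MaxAttempts: 3               // assumed value: give up after three retries
  }],
  Next: 'Wait1'
};

// Seconds waited before each retry: IntervalSeconds * BackoffRate^(retry - 1).
function retryDelays(retrier) {
  const delays = [];
  for (let retry = 1; retry <= retrier.MaxAttempts; retry++) {
    delays.push(retrier.IntervalSeconds * Math.pow(retrier.BackoffRate, retry - 1));
  }
  return delays;
}

console.log(retryDelays(sendMessage1.Retry[0])); // [ 2, 4, 8 ]
```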
Now, the message is sent, so we have to wait.
"Wait1": { |
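Since the listing above is cut off, here is a hedged sketch of a complete Wait state. The duration of 10 seconds is an assumption chosen to keep test runs short. For real users you would likely wait hours or days.

```javascript
'use strict';

// A Wait state pauses the execution, then moves on to the state named in Next.
const wait1 = {
  Type: 'Wait',
  Seconds: 10, // assumed duration
  Next: 'FetchActivityCount1'
};

console.log(JSON.stringify({ Wait1: wait1 }, null, 2));
```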
After that, we have to get the number of activities the user did (e.g., query a database).
"FetchActivityCount1": { |
After we have the information, it’s time to make a decision:
"CheckActivityCount1": { |
- The Choices property defines an array of rules. Each rule:
  - Selects a property from the input using JsonPath in Variable.
  - Compares it, e.g., with NumericEquals or one of many other operators.
  - Defines the next state in Next.
- The Default property names the next state if no rule in Choices matched.
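A sketch of such a Choice state, together with a toy evaluator that mimics how the rule is applied. Assumptions: the input field is called activityCount, and zero activities means the user did not complete the initial steps. The function nextState is purely illustrative. It supports only NumericEquals on a top-level field, whereas Step Functions evaluates the rules itself.

```javascript
'use strict';

const checkActivityCount1 = {
  Type: 'Choice',
  Choices: [{
    Variable: '$.activityCount', // JsonPath into the state input (assumed field name)
    NumericEquals: 0,            // no activity yet: send the next message
    Next: 'SendMessage2'
  }],
  Default: 'Done'                // any other value: the user is engaged
};

// Toy evaluator: returns the name of the next state for a given input.
function nextState(choiceState, input) {
  for (const rule of choiceState.Choices) {
    const value = input[rule.Variable.replace(/^\$\./, '')];
    if (typeof value === 'number' && value === rule.NumericEquals) {
      return rule.Next;
    }
  }
  return choiceState.Default;
}

console.log(nextState(checkActivityCount1, { activityCount: 0 })); // SendMessage2
console.log(nextState(checkActivityCount1, { activityCount: 1 })); // Done
```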
Finally, the last state is reached.
"Done": { |
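Putting the pieces together, the whole state machine from the table above could be assembled as sketched below. This is not the original definition. The single placeholder ARN (a real template would reference separate functions for sending and fetching), the retry values, the wait duration, the activityCount field, and the helper danglingTargets are all assumptions. The helper checks that every referenced state name exists, which catches typos before deployment.

```javascript
'use strict';

const lambdaArn = 'arn:aws:lambda:$region:$account-id:function:$function-name'; // placeholder
const retry = [{ ErrorEquals: ['States.ALL'], IntervalSeconds: 2, BackoffRate: 2, MaxAttempts: 3 }];

// Small factories to avoid repeating the same state bodies.
const task = (next) => ({ Type: 'Task', Resource: lambdaArn, Retry: retry, Next: next });
const wait = (next) => ({ Type: 'Wait', Seconds: 10, Next: next });
const check = (otherwise) => ({
  Type: 'Choice',
  Choices: [{ Variable: '$.activityCount', NumericEquals: 0, Next: otherwise }],
  Default: 'Done'
});

const definition = {
  StartAt: 'SendMessage1',
  States: {
    SendMessage1: task('Wait1'),
    Wait1: wait('FetchActivityCount1'),
    FetchActivityCount1: task('CheckActivityCount1'),
    CheckActivityCount1: check('SendMessage2'),
    SendMessage2: task('Wait2'),
    Wait2: wait('FetchActivityCount2'),
    FetchActivityCount2: task('CheckActivityCount2'),
    CheckActivityCount2: check('SendMessage3'),
    SendMessage3: task('Done'),
    Done: { Type: 'Succeed' } // a Succeed state has no Next
  }
};

// Collect every state name that is referenced but not defined.
function danglingTargets(def) {
  const targets = [def.StartAt];
  for (const state of Object.values(def.States)) {
    if (state.Next) targets.push(state.Next);
    if (state.Default) targets.push(state.Default);
    for (const rule of state.Choices || []) targets.push(rule.Next);
  }
  return targets.filter((name) => !(name in def.States));
}

console.log(danglingTargets(definition)); // []
console.log(JSON.stringify(definition, null, 2));
```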
Now it’s time to wire everything together with CloudFormation.
CloudFormation template
The following is only an excerpt of the full CloudFormation template.
|
Installation
Download the source code and create a stack:
aws cloudformation create-stack --stack-name example --template-body file://template.yml --capabilities CAPABILITY_IAM
After a few minutes, CloudFormation has created a bunch of Lambda functions, IAM roles, and a state machine for you.
Creating an execution
To create a state machine execution:
- Visit the Step Functions Management Console.
- Click on the only state machine.
- Press the New execution button.
- Supply an Execution id (e.g., 1).
- Press the Start Execution button.
Depending on chance, you will take one of many paths through the state machine (keep in mind that the Lambda functions fail 30% of the time and randomly return zero or one activity). Therefore, your execution graph will likely look slightly different.
One thing that I want to highlight is the retry mechanism for Task states. Below the Visual Workflow, you can see a full log of the execution. Mine looked like this:
- In line 11, the Lambda was executed for the first time, but it failed in line 12 at 8:22:43.
- In line 14, the Lambda was executed again at 8:22:45 (exactly 2 seconds later, as defined in the Retry property!).
- Line 15 tells us that this time, the Lambda executed without an error.
Keep in mind that your log will look different. But you will likely see log entries of type LambdaFunctionFailed. If not, create a few more executions and look at them.
Clean up
Don’t forget to clean up the CloudFormation stack:
aws cloudformation delete-stack --stack-name example
Summary
In this post, you learned that:
- You can implement state machines with AWS Step Functions.
- Each state can do different things depending on the Type of the state.
- A Lambda function can be called from a state of type Task and can be retried in the case of a failure.
- The Choice state type can select the next state based on input data.