Your Lambda function might execute twice. Be prepared!
Are you confused when scheduled Lambdas execute twice, SNS messages trigger an invocation three times, or your handmade S3 inventory is out of date because events occurred twice? Bad news: sooner or later, your Lambda function will be invoked multiple times. You have to be prepared! The reasons are retries on errors and event sources that guarantee at-least-once delivery (e.g., CloudWatch Events, SNS, …).
How do you know that your Lambda function is broken (or not idempotent)? If your function is given the same input (aka event) multiple times, it MUST produce the same result each time. If your function produces different results for the same input, the implementation is not idempotent, and you are in big trouble.
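To make the definition concrete, here is a minimal in-memory sketch that is not tied to any AWS API. The state shape and the requestId field are assumptions made for illustration only.

```javascript
// Non-idempotent: every delivery of the same event changes the state again.
// The event is ignored, so a retry cannot be told apart from a new request.
function countCall(state, event) {
  return { ...state, calls: state.calls + 1 };
}

// Idempotent: a repeated delivery is recognized by its (hypothetical)
// requestId and leaves the state unchanged.
function recordRequest(state, event) {
  const requests = new Set(state.requests);
  requests.add(event.requestId);
  return { ...state, requests };
}

// Deliver the same event twice, as an at-least-once event source might do.
const event = { user: 'michael', requestId: 'a1b2' };
const counted = countCall(countCall({ calls: 0 }, event), event);
const recorded = recordRequest(recordRequest({ requests: new Set() }, event), event);

console.log(counted.calls);          // 2: the retry was counted again
console.log(recorded.requests.size); // 1: the retry made no difference
```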
You may ask yourself how to fix that. Let’s work with a concrete example. Imagine a Lambda function that ensures a user can only make a specific number of requests per day. A request could be an upload to an S3 bucket, sending a message, whatever. In other words, the Lambda function implements rate limiting. To do so, you need to store some state. A good place to store the state is DynamoDB. Luckily, DynamoDB offers many features to fix your problem.
1. Iteration: The not idempotent implementation
The first iteration is the simplest implementation, but it is also broken.
The input event looks like this:
{ |
The function uses a DynamoDB table ratelimit with the primary key id (partition key). The id consists of the user and the current date (yyyy-mm-dd). Additionally, a calls attribute of type number is used to track the number of calls.
const AWS = require('aws-sdk'); |
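A minimal sketch of what such a first iteration could look like. It assumes an event of the form { user: '...' }, a hypothetical LIMIT of 10 calls per day, and an id built as user:yyyy-mm-dd; none of these details are confirmed by the text. The DynamoDB client is passed in as a parameter so that the sketch does not depend on how the SDK is set up; with the AWS SDK for JavaScript v2 it could be new AWS.DynamoDB().

```javascript
const LIMIT = 10; // assumed daily limit per user

// Builds the UpdateItem parameters that increment the calls counter.
function buildParams(event, now = new Date()) {
  const date = now.toISOString().slice(0, 10); // yyyy-mm-dd (UTC)
  return {
    TableName: 'ratelimit',
    Key: { id: { S: `${event.user}:${date}` } },
    // ADD starts from 0 if the attribute does not exist yet.
    UpdateExpression: 'ADD #calls :one',
    // Reject the update as soon as the limit is reached.
    ConditionExpression: 'attribute_not_exists(#calls) OR #calls < :limit',
    ExpressionAttributeNames: { '#calls': 'calls' },
    ExpressionAttributeValues: {
      ':one': { N: '1' },
      ':limit': { N: String(LIMIT) },
    },
  };
}

// dynamodb: an AWS.DynamoDB client (SDK v2), injected by the caller.
async function handler(event, dynamodb) {
  try {
    await dynamodb.updateItem(buildParams(event)).promise();
    return { limited: false };
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      return { limited: true }; // limit reached for this user and day
    }
    throw err;
  }
}
```

Note that nothing in the parameters identifies the individual request, so every invocation adds 1 to the counter.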
The implementation is not idempotent. The calls attribute is incremented even if the invocation is just a retry, so the function would start limiting the user too early. Let’s fix that!
2. Iteration: The mostly idempotent implementation
Let’s try to fix the first iteration. Add a request id to the event.
The input event looks like this:
{ |
All event sources provide some unique id that you can use as the request id. Some examples:
- Kinesis: Records[].eventID
- SNS: Records[].Sns.MessageId
- API Gateway: requestContext.requestId
- Scheduled CloudWatch Event: id
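The fields listed above can be read with a small helper. This is only a sketch: the function name requestIds is hypothetical, and it tells the event sources apart solely by the fields named in the list. Kinesis and SNS deliver records in batches, so the helper returns an array.

```javascript
// Returns the unique ids carried by an event (hypothetical helper).
function requestIds(event) {
  if (Array.isArray(event.Records)) {
    // Kinesis and SNS: one id per record in the batch.
    return event.Records.map((record) =>
      (record.Sns ? record.Sns.MessageId : record.eventID));
  }
  if (event.requestContext && event.requestContext.requestId) {
    // API Gateway
    return [event.requestContext.requestId];
  }
  // Scheduled CloudWatch Event
  return [event.id];
}

console.log(requestIds({ id: 'cdc73f9d' })); // [ 'cdc73f9d' ]
```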
The function uses the same DynamoDB table as in the first iteration. Instead of the calls attribute, you use a requests attribute of type string set.
The requests attribute keeps track of all request ids that were already counted by the rate limiter. This way, you avoid counting the same request twice. DynamoDB also ensures that the requests set is limited to a certain number of request ids.
const AWS = require('aws-sdk'); |
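A minimal sketch of how the second iteration could look. It assumes an event of the form { user, requestId }, a hypothetical LIMIT of 10, an id built as user:yyyy-mm-dd, and an AWS.DynamoDB client (SDK v2) that is passed in by the caller. The condition expression is one possible way to cap the size of the requests set; the text does not say how the original does it.

```javascript
const LIMIT = 10; // assumed daily limit per user

function buildParams(event, now = new Date()) {
  const date = now.toISOString().slice(0, 10); // yyyy-mm-dd (UTC)
  return {
    TableName: 'ratelimit',
    Key: { id: { S: `${event.user}:${date}` } },
    // Adding an element that is already in the set leaves the set unchanged.
    UpdateExpression: 'ADD #requests :request',
    // Allow the update if the request was already counted (a retry)
    // or if there is still room in the set.
    ConditionExpression:
      'attribute_not_exists(#requests) OR contains(#requests, :id) OR size(#requests) < :limit',
    ExpressionAttributeNames: { '#requests': 'requests' },
    ExpressionAttributeValues: {
      ':request': { SS: [event.requestId] },
      ':id': { S: event.requestId },
      ':limit': { N: String(LIMIT) },
    },
  };
}

// dynamodb: an AWS.DynamoDB client (SDK v2), injected by the caller.
async function handler(event, dynamodb) {
  try {
    await dynamodb.updateItem(buildParams(event)).promise();
    return { limited: false };
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      return { limited: true }; // set is full and this request is not in it
    }
    throw err;
  }
}
```

A retry of a request that was accepted before passes the contains check and changes nothing, while a retry of a rejected request is rejected again because the set is still full.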
You can invoke this function as often as you like: as long as the input stays the same, the result is the same.
Wait, no. The date (yyyy-mm-dd) is used as part of the key. So if the current date changes to the next day, the result will be different.
3. Iteration: The idempotent implementation
Let’s fix that as well! The date has to be part of the input event, too. I added a timestamp, but you can choose whatever format you like.
The input event looks like this:
{ |
Some event sources provide a date value out of the box. Some examples:
- Kinesis: not supported but a workaround is possible: put the timestamp in the record data
- SNS: Records[].Sns.Timestamp
- API Gateway: not supported and a workaround is problematic because you would rely on the client’s time
- Scheduled CloudWatch Event: time
The function uses the same DynamoDB table as in the first iteration. You can reuse the implementation of the second iteration, but replace the way the date variable is generated:
const date = new Date(event.timestamp).toISOString().slice(0, 10);
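The effect of this change can be illustrated with a small sketch. The helper name itemId, the user:yyyy-mm-dd format of the id, and the sample event are assumptions; the point is only that the id now depends on the event and no longer on the moment of execution.

```javascript
// Derives the item id from the event alone, never from the wall clock.
function itemId(event) {
  const date = new Date(event.timestamp).toISOString().slice(0, 10);
  return `${event.user}:${date}`;
}

// A request made one second before midnight (UTC).
const event = { user: 'michael', requestId: 'a1b2', timestamp: '2017-10-03T23:59:59Z' };

// A retry that runs after midnight still resolves to the same item.
console.log(itemId(event)); // michael:2017-10-03
```

Keep in mind that toISOString() always returns UTC, so the day boundary in this sketch is midnight UTC.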
That’s it. Idempotent rate limiting is implemented.
Summary: request ids as a savior
Use request identifiers to implement idempotent Lambda functions that do not break if a function invocation needs to be retried. Make sure to pass the original request id to all subsequent calls. The original request id should be generated as early as possible in the chain, ideally on the client side. Avoid generating your own ids.
Use the original request id as the identity to guarantee idempotency:
- In DynamoDB, use it as a primary key together with a ConditionExpression or UpdateExpression SET if_not_exists(att, :val).
- In Step Functions, use it as the execution name.
- For other services like Kinesis and SQS, put the original request id in the payload. The consumer can then use the original request id again.
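The first two items of this list can be sketched as plain parameter objects for the AWS SDK calls PutItem (DynamoDB) and StartExecution (Step Functions). The table name requests, the payload attribute, and the function names are hypothetical and only serve the illustration.

```javascript
// DynamoDB PutItem: the write only succeeds the first time a request id is seen.
function putOnceParams(requestId, payload) {
  return {
    TableName: 'requests', // hypothetical table with partition key id
    Item: {
      id: { S: requestId },
      payload: { S: JSON.stringify(payload) },
    },
    // Fails with ConditionalCheckFailedException if the item already exists.
    ConditionExpression: 'attribute_not_exists(#id)',
    ExpressionAttributeNames: { '#id': 'id' },
  };
}

// Step Functions StartExecution: the request id becomes the execution name.
function startExecutionParams(stateMachineArn, requestId, input) {
  return {
    stateMachineArn,
    name: requestId,
    input: JSON.stringify(input),
  };
}
```

With DynamoDB, the caller can treat a ConditionalCheckFailedException as "already processed". Step Functions restricts the length and the characters of execution names, so check that your request ids qualify before using them unchanged.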
You are now equipped to implement idempotent Lambda functions using DynamoDB. If you use any other data store, think carefully about idempotency!
Implementation notes
The implementation does not work well with large limits (> 5000) because the requests set grows with every request added. Sooner or later, you will hit the 400 KB size limit of a DynamoDB item.
You could circumvent the limitation by working with smaller timespans (e.g., yyyy-mm-dd-hh or yyyy-mm-dd-hh-mm), which also leads to a lower limit per timespan but assumes a more or less even distribution of requests during the day.
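A small sketch of how the key and the limit could be derived for smaller timespans. The function names and the granularity values day, hour, and minute are assumptions; dividing the daily limit evenly is exactly the even-distribution assumption mentioned above.

```javascript
// Number of characters of the ISO timestamp to keep per granularity.
const LENGTH = { day: 10, hour: 13, minute: 16 };
const BUCKETS_PER_DAY = { day: 1, hour: 24, minute: 1440 };

// Turns a timestamp into yyyy-mm-dd, yyyy-mm-dd-hh, or yyyy-mm-dd-hh-mm (UTC).
function bucket(timestamp, granularity) {
  return new Date(timestamp)
    .toISOString()                 // e.g. 2017-10-03T23:59:59.000Z
    .slice(0, LENGTH[granularity]) // e.g. 2017-10-03T23 for hour
    .replace(/[T:]/g, '-');        // e.g. 2017-10-03-23
}

// Splits the daily limit evenly across the buckets of a day.
function limitPerBucket(dailyLimit, granularity) {
  return Math.floor(dailyLimit / BUCKETS_PER_DAY[granularity]);
}

console.log(bucket('2017-10-03T23:59:59Z', 'hour')); // 2017-10-03-23
console.log(limitPerBucket(24000, 'hour'));          // 1000
```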
If you need a big limit, store each request as a separate item in DynamoDB and sum them asynchronously by using a DynamoDB stream on the table. Keep the current state of whether a user is limited in a separate DynamoDB table. This is not an exact rate limiter, but with large limits, you can probably tolerate users making slightly more requests than allowed.
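A rough sketch of the aggregation step. It assumes a hypothetical table with the partition key id (user:yyyy-mm-dd) and the sort key requestId, so that each request is its own item. The function only counts the INSERT records of one stream batch per id; adding the counts to the second table is left out.

```javascript
// Counts newly inserted request items per id in one DynamoDB stream batch.
function countInserts(streamEvent) {
  const counts = {};
  for (const record of streamEvent.Records) {
    // Writing an item that already exists does not produce an INSERT record,
    // so a retried request is not counted again here.
    if (record.eventName !== 'INSERT') continue;
    const id = record.dynamodb.Keys.id.S;
    counts[id] = (counts[id] || 0) + 1;
  }
  return counts;
}
```

Lambda may deliver the same stream batch more than once, so sums built this way can drift slightly. That matches the trade-off described above: the limiter is not exact.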