CloudFormation cfn-init pitfall: Auto scaling and throttling error rate exceeded
cfn-init is a little helper to install and configure EC2 instances managed with CloudFormation. Lately, I ran into issues when starting a larger number of EC2 instances (let’s say 50) during an auto scaling event. This blog post explains why the error happens and how to avoid it.
Introducing cfn-init
cfn-init configuration is added as metadata to a resource using the AWS::CloudFormation::Init key. The following example configures cfn-init to
- create/update the file /etc/sample.conf.
- enable & start the service sample (also restarts the service if /etc/sample.conf is changed).
cfn-init is usually executed in the user data script.
VirtualMachine: |
The pitfall
The way cfn-init is implemented is this:
- Call the CloudFormation API DescribeStackResource to read the metadata.
- Validate and parse the metadata.
- Apply the configuration to the EC2 instance.
Unfortunately, the CloudFormation API has notoriously low API rate limits, and cfn-init does not retry in the case of a rate exceeded error. Therefore, when many EC2 instances run cfn-init more or less at the same time, you will see the following error:
2022-06-02 07:13:14,838 [DEBUG] Response: 400 https://cloudformation.us-east-1.amazonaws.com/?Action=DescribeStackResource&LogicalResourceId=ScanLaunchTemplate&ContentType=JSON&StackName=***&Version=2010-05-15 [headers: {'x-amzn-RequestId': '***', 'Content-Type': 'application/json', 'Content-Length': '124', 'Date': 'Thu, 02 Jun 2022 07:13:14 GMT', 'Connection': 'close'}]
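To get a feeling for why a burst of instances trips the limit, here is a minimal simulation sketch in Python. It models the API limit as a token bucket, which is an assumption on my side, and the rate and burst numbers are made up for illustration; they are not the real CloudFormation limits. Each simulated instance calls the API exactly once and does not retry, mirroring the behavior described above.

```python
class TokenBucket:
    """Hypothetical model of an API rate limit (not the real AWS implementation)."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate          # tokens refilled per second (assumed value)
        self.capacity = capacity  # maximum burst size (assumed value)
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # timestamp of the previous call

    def allow(self, now: float) -> bool:
        # Refill according to the elapsed time, capped at the burst capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the API would answer with a "rate exceeded" error here


def simulate(instances: int, window_seconds: float, rate: float, capacity: float) -> int:
    """Count throttled instances when each makes one call and never retries.

    The calls are spread evenly over window_seconds.
    """
    bucket = TokenBucket(rate=rate, capacity=capacity)
    throttled = 0
    for i in range(instances):
        now = window_seconds * i / instances
        if not bucket.allow(now):
            throttled += 1
    return throttled


if __name__ == "__main__":
    # A handful of instances stays within the assumed burst capacity.
    print("5 instances: ", simulate(5, 5.0, rate=2.0, capacity=10.0), "throttled")
    # 50 instances booting within a few seconds exceed it.
    print("50 instances:", simulate(50, 5.0, rate=2.0, capacity=10.0), "throttled")
```

With these assumed numbers, a small deployment never notices the limit, while a scale-out of 50 instances loses a large share of its cfn-init runs. Every throttled call is an instance that is not configured.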
Solving the issue
How can we solve the issue?
- Do not use cfn-init at all.
- Load the metadata from a file.
To load the metadata from a file and not the CloudFormation API, create a file (e.g., metadata.json) like this:
{ |
And invoke cfn-init like this:
/opt/aws/bin/cfn-init -v metadata.json
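If you prefer to generate the metadata file from a script (for example, while baking an AMI), a minimal sketch in Python could look like the following. It mirrors the configuration described at the beginning of this post: the file /etc/sample.conf and the service sample. The helper names build_metadata and write_metadata, the file content, and the file mode and owner are hypothetical. I also assume that the file needs the same AWS::CloudFormation::Init top-level key as the template metadata.

```python
import json
from pathlib import Path


def build_metadata(conf_path: str, conf_content: str, service: str) -> dict:
    """Build a cfn-init metadata document (hypothetical helper)."""
    return {
        "AWS::CloudFormation::Init": {
            "config": {
                # Create/update the configuration file.
                "files": {
                    conf_path: {
                        "content": conf_content,
                        "mode": "000644",  # assumed permissions
                        "owner": "root",   # assumed owner
                        "group": "root",
                    }
                },
                # Enable and start the service; restart it when the file changes.
                "services": {
                    "sysvinit": {
                        service: {
                            "enabled": "true",
                            "ensureRunning": "true",
                            "files": [conf_path],
                        }
                    }
                },
            }
        }
    }


def write_metadata(path: str, metadata: dict) -> None:
    """Serialize the metadata as pretty-printed JSON."""
    Path(path).write_text(json.dumps(metadata, indent=2) + "\n")


if __name__ == "__main__":
    metadata = build_metadata("/etc/sample.conf", "key=value\n", "sample")
    write_metadata("metadata.json", metadata)
```

Because the file is read from local disk, none of the instances has to call the CloudFormation API during startup.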
I hope this article will help you avoid the pitfall.