Hot off the Cloud: October 2022
What happened at AWS in October 2022? This is our summary and analysis of the announcements that interested us.
In November, re:Invent, AWS’ major conference, will take place in Las Vegas. During re:Invent, AWS will announce many new features and services. Therefore, it was a bit quieter in October. We mainly saw more minor announcements, but they definitely made all of our lives better.
First of all, what’s EBS Snapshots Archive? An EBS snapshot costs you $0.05/GB-month. With the EBS Snapshots Archive’s cold-storage option, you only pay $0.0125/GB-month. Sounds great? Not really, because standard snapshots are incremental backups. So when you snapshot a volume with 100 GB, change 1 GB of data, and create a second snapshot, you pay for only 101 GB of consumed storage. But the EBS Snapshots Archive only supports full snapshots. So, when archiving the two snapshots from the example, you have to pay for 200 GB.
It doesn’t take a genius to realize that the EBS Snapshots Archive is cheaper if the blocks on the EBS volumes change significantly. For volumes with little change, switching to the cold-storage option may even be more expensive.
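To make the trade-off concrete, here is a minimal Python sketch of the arithmetic. It uses the two prices quoted above and assumes that every snapshot is either kept in the standard tier or archived, that the volume is completely filled, and that the same amount of data changes between snapshots. The function names are ours and purely illustrative.

```python
STANDARD_PRICE = 0.05    # USD per GB-month, standard snapshot tier (from the text)
ARCHIVE_PRICE = 0.0125   # USD per GB-month, archive tier (from the text)


def standard_cost(volume_gb, changed_gb, snapshots):
    """Incremental snapshots: one full copy plus the changed blocks of each later snapshot."""
    stored_gb = volume_gb + changed_gb * (snapshots - 1)
    return stored_gb * STANDARD_PRICE


def archive_cost(volume_gb, snapshots):
    """Archived snapshots are full copies, so every snapshot stores the whole volume."""
    return volume_gb * snapshots * ARCHIVE_PRICE


# 100 GB volume with 1 GB of changes between snapshots, as in the example above.
for count in (2, 10, 30):
    std = standard_cost(100, 1, count)
    arc = archive_cost(100, count)
    print(f"{count:>2} snapshots: standard ${std:.2f}, archive ${arc:.2f}")
```

Under these assumptions, the archive is still cheaper for the two snapshots from the example, but the relation flips as soon as you keep more than a handful of snapshots of a volume that barely changes.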
The Amazon Data Lifecycle Manager is a feature of the EC2 service that can back up EC2 instances and volumes. That’s precisely what AWS Backup does as well. Besides that, AWS Backup supports a bunch of other services. The Data Lifecycle Manager seems to be the old-fashioned way to back up EBS volumes. So it is surprising that this feature saw the light of day.
In summary, this announcement is not essential for most of us. However, we learned a lot while looking into the details.
Hurray, one less reason to use the AWS root user. It is important to note that AWS introduced a new IAM service prefix called supportplans. Besides that, we are waiting for the possibility of subscribing to a support plan for the whole organization.
Use a CloudWatch alarm to monitor the rollout of changes to EC2 instances and stop an automation, a run command, the state manager, and maintenance windows if the alarm flips from OK to ALARM. We cannot think of how our workloads, which primarily run on single EC2 instances, could benefit from this. But this is interesting when rolling out changes to vast fleets of EC2 instances.
Fun fact, you could even do some chaos engineering with this feature. This makes the AWS Fault Injection Simulator jealous.
This one got our attention right away. File Cache promises a POSIX interface for accessing files on S3 or NFS. We needed to access S3 buckets from EC2 instances when migrating legacy applications. However, most solutions out there come with significant drawbacks. File Cache is an interesting approach, as it provides a central layer for S3, allowing you to lock files and invalidate the cache.
Amazon File Cache is part of the FSx service. But it’s not a file system like FSx for Windows File Server. File Cache is based on Lustre and requires installing a client on the machines that need to access the cache.
At first glance, the pricing model looks exciting: $1.330 per GB-month. However, when we tried to create our first cache, we noticed that the minimum storage required for the cache is 1.2 TB. So Amazon File Cache starts at about $1600 per month. Not what we expected.
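The starting price follows directly from the two numbers above. A tiny Python sketch of the calculation, assuming decimal terabytes (1 TB = 1,000 GB) and no additional charges:

```python
PRICE_PER_GB_MONTH = 1.330  # USD, from the text
MIN_STORAGE_TB = 1.2        # minimum cache size, from the text
GB_PER_TB = 1000            # assumption: decimal terabytes


def minimum_monthly_cost():
    """Monthly cost of the smallest cache that can be created."""
    return MIN_STORAGE_TB * GB_PER_TB * PRICE_PER_GB_MONTH


print(f"${minimum_monthly_cost():,.2f} per month")
```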
Amazon SQS announces Server-Side Encryption with Amazon SQS-managed encryption keys (SSE-SQS) by default
Werner Vogels once said “Dance like no one is watching, encrypt like everyone is.” and we are glad to observe that AWS progresses on its path to encrypt everything by default.
Amazon Virtual Private Cloud (VPC) now supports two new CloudWatch metrics to measure and track network address usage
Large organisations, especially those that peer their VPCs with their local networks, are limited by the available IP address space. Now, AWS provides two additional CloudWatch metrics to monitor the network address usage (NAU).
Please note that those metrics are not enabled by default. The following AWS CLI command does the trick.
aws ec2 modify-vpc-attribute --vpc-id vpc-xyz --enable-network-address-usage-metrics
AWS measures NAU units.
- 1 NAU per IPv4/IPv6 address assigned to a network interface.
- 6 NAUs per Lambda function with VPC integration.
- 6 NAUs per NAT Gateway and VPC endpoint.
While diving into the details about the new metrics, we learned about the following VPC quotas that we had never heard about before.
Each VPC can have up to 64,000 NAU units by default and up to 256,000 by requesting a quota increase.
If a VPC is peered with other VPCs, the VPCs combined can have up to 128,000 NAU units by default. You can request a quota increase up to 512,000.
VPCs that are peered across different regions or with Transit Gateway do not contribute to this limit.
The most interesting fact is that the NAU quotas do not correlate directly with the number of available IP addresses.
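The NAU weights and default quotas listed above can be turned into a quick estimate. The following Python sketch only covers the resource types mentioned in this post, and the function names and example numbers are made up for illustration.

```python
NAU_PER_IP_ADDRESS = 1                # per IPv4/IPv6 address on a network interface
NAU_PER_LAMBDA_FUNCTION = 6           # per Lambda function with VPC integration
NAU_PER_NAT_GATEWAY_OR_ENDPOINT = 6   # per NAT gateway and per VPC endpoint
DEFAULT_VPC_QUOTA = 64_000            # default NAU quota for a single VPC
DEFAULT_PEERED_QUOTA = 128_000        # default NAU quota for peered VPCs combined


def nau_units(ip_addresses, lambda_functions=0, nat_gateways=0, vpc_endpoints=0):
    """Estimate the NAU units of a VPC from simple resource counts."""
    return (
        ip_addresses * NAU_PER_IP_ADDRESS
        + lambda_functions * NAU_PER_LAMBDA_FUNCTION
        + (nat_gateways + vpc_endpoints) * NAU_PER_NAT_GATEWAY_OR_ENDPOINT
    )


def quota_usage(units, quota=DEFAULT_VPC_QUOTA):
    """Return the share of the quota that is already used (0.0 to 1.0 and above)."""
    return units / quota


# Hypothetical VPC: 2,000 addresses, 50 Lambda functions, 3 NAT gateways, 10 endpoints.
units = nau_units(2000, lambda_functions=50, nat_gateways=3, vpc_endpoints=10)
print(f"{units} NAU units, {quota_usage(units):.1%} of the default VPC quota")
```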
We reviewed Aurora Serverless v2 about 6 months ago. One of the issues we identified was the missing CloudFormation support. This is now a thing of the past!
However, there are still two main issues why we do not recommend Aurora Serverless v2 for most scenarios.
First, Aurora Serverless v2 is quite expensive. Our statement from the review is still up to date:
[…] using Aurora Serverless v2 makes sense for workloads that are idling for more than 77% of the time compared to on-demand instances. Or even worse, only for workloads idling more than 96% of the time compared to reserved instances with a three-year term and all-upfront payment.
Second, there is still no Data API available, which is a must-have for connecting Lambda with Aurora Serverless v2, in our opinion.
IAM Access Analyzer now reviews your AWS CloudTrail history to identify actions used across 140 AWS services and generates fine-grained policies
The part of IAM Access Analyzer that generates IAM policies based on CloudTrail is useless. The data set that AWS uses to generate the policies is incomplete. Many so-called data events are missing, for example, DynamoDB reads, SQS messages, and many more. And don’t get us started about the costs for S3 and Lambda data events.
The other part of IAM Access Analyzer, which checks IAM policies, is a good starting point when reviewing the security of your AWS accounts.
We are huge fans of the ARM processor architecture in general and Graviton2 in particular. That’s why we switched marbot, our AWS monitoring chatbot, to Graviton2. Doing so required us to modify a few YAML/SAM files and took about 10 minutes.
AWS promises better performance and lower costs when switching to Graviton2.
[…] Graviton2, using an Arm-based processor architecture, are designed to deliver up to 19% better performance at 20% lower cost for a variety of Serverless workloads […]
We could not notice any performance improvements. But we are happy about the fact that our monthly Lambda bill will drop from $3 to $2.40.
AWS IQ now supports partners and independent consultants in Australia, Europe, Japan, and other regions
On the one hand, that’s great news for independent consultants and small consulting firms.
On the other hand, be warned: this will be a race to the bottom. There is basically nothing to differentiate yourself from the competitors besides AWS certifications and reviews. So basically, you are competing on price with consultants from all over the world.
If you decide to participate in the race to the bottom, think about where you will end up when you are winning.
AWS Glue Crawlers support incremental Amazon S3 crawling on existing AWS Glue Data Catalog tables
Instead of crawling all objects within a bucket, Glue now supports incremental crawling. That’s great because all those ListBucket and GetObject calls can become expensive. Here is how incremental crawling works:
- S3 sends event notifications to SQS
- Glue crawler starts periodically (e.g., once a day)
- Glue crawler fetches event notifications from SQS
- Glue crawler only scans through modified S3 objects
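To illustrate the idea behind these steps, here is a rough Python sketch of how a consumer could derive the list of modified objects from S3 event notifications received via SQS. This is not how Glue is implemented internally. It only assumes the standard S3 event notification JSON layout, and the bucket and key names are invented.

```python
import json
from urllib.parse import unquote_plus


def modified_objects(message_bodies):
    """Return the set of (bucket, key) pairs mentioned in S3 event notifications."""
    changed = set()
    for body in message_bodies:
        event = json.loads(body)
        # Test events sent by S3 carry no "Records" list and are skipped.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded inside event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            changed.add((bucket, key))
    return changed


# Hypothetical SQS message bodies: one object upload and one S3 test event.
SAMPLE_BODIES = [
    json.dumps({"Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {"bucket": {"name": "example-bucket"},
               "object": {"key": "logs/2022/10/day+1.json"}},
    }]}),
    json.dumps({"Event": "s3:TestEvent"}),
]

print(modified_objects(SAMPLE_BODIES))
```

Only the objects collected this way need to be scanned, which is where the savings on ListBucket and GetObject calls come from.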
AWS Lambda now supports event filtering for Amazon MSK, Self-Managed Kafka, Amazon MQ for Apache ActiveMQ, and Amazon MQ for RabbitMQ as event sources
Lambda now supports filtering events from Kafka, ActiveMQ, and RabbitMQ. As filtering is free of charge, this allows you to reduce your Lambda costs, in case you had to implement filtering yourself before.
Interesting to know: the syntax for defining filters for Kafka, ActiveMQ, and RabbitMQ is the same as for EventBridge rules.
Be warned when developing and testing filter rules: it can take up to 15 minutes for changes to filter rules to take effect. So be patient!
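Because of that delay, it can help to check a pattern locally before deploying it. The following Python sketch implements only a tiny subset of the EventBridge-style syntax, namely exact matching of leaf values against a list of allowed values. The real service supports more operators, and the `value` key as well as the sample payload are assumptions for illustration.

```python
def matches(pattern, event):
    """Check an event against a pattern using exact-value matching only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the pattern lists the allowed values.
            return False
    return True


# Hypothetical filter: only pass records whose payload reports an error.
PATTERN = {"value": {"status": ["ERROR", "FATAL"]}}

print(matches(PATTERN, {"value": {"status": "ERROR", "id": 1}}))
print(matches(PATTERN, {"value": {"status": "OK", "id": 2}}))
```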
From now on, the following quotas apply per account and region:
- Public AMIs: 5
- Number of entities to share an AMI with: 1,000
- Public and private AMIs: 50,000
All three quotas are adjustable upon request.
We guess that AWS introduced these quotas to a) spot issues with public AMIs that should be private and b) avoid high costs caused by large numbers of AMIs.
The query engine v3 introduces new features and built-in functions. None of those were of interest for our use cases, but there are engineers out there who have been waiting for exactly these features.
Besides that, AWS promises a 20% performance improvement. However, our queries take about 10% longer when running on v3. Therefore, we cannot yet recommend v3 without hesitation.
This announcement sounded great! However, the Cost Explorer did not change significantly. We could not identify any new features or significant modifications. AWS rebuilt the Cost Explorer based on their latest UI kit. A bit disappointing.
Amazon SQS announces increased throughput quota for FIFO High Throughput (HT) mode to up to 6,000 Transactions Per Second (TPS)
This announcement made us think about replacing Kinesis Data Streams, which we currently use as the backbone for marbot, our AWS Monitoring chatbot. With Kinesis Data Streams, we benefit from ordered events and a built-in retry mechanism. Kinesis requires provisioning shards, whereas SQS charges per request. Therefore, we could save money by switching from Kinesis to SQS.
A Kinesis shard supports up to 1,000 transactions per second, and you can add shards to a stream to scale the throughput. Compared to that, an SQS FIFO queue now supports up to 6,000 transactions per second. Scaling beyond that requires creating an additional queue. To support that, the sender needs some logic to distribute events between two or more queues. We use a Lambda function to process the messages, so implementing the receiver side should be simple: adding another event source mapping pointing to the same function.
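The sender-side logic could be as simple as hashing the message group ID. The following Python sketch is one possible approach, not something we run in production. The queue URLs are hypothetical placeholders.

```python
import hashlib

# Hypothetical FIFO queue URLs; replace with your own.
QUEUE_URLS = [
    "https://sqs.eu-west-1.amazonaws.com/111122223333/events-0.fifo",
    "https://sqs.eu-west-1.amazonaws.com/111122223333/events-1.fifo",
]


def pick_queue(message_group_id, queue_urls=QUEUE_URLS):
    """Map a message group to a queue in a stable way.

    The same group always lands in the same queue, so the ordering
    within a group is preserved as long as the queue list is unchanged.
    """
    digest = hashlib.sha256(message_group_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(queue_urls)
    return queue_urls[index]


for group in ("customer-1", "customer-2", "customer-3"):
    print(group, "->", pick_queue(group))
```

Note that adding a queue later changes the mapping for some groups, so in-flight messages of a group might be spread across two queues during the switch.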
What are your thoughts on Kinesis Data Streams vs. SQS FIFO HT?
Until now, it was common to pass values from the Parameter Store or Secrets Manager to a Lambda function via environment variables. However, by doing so, the values were handed over unencrypted.
The new extension Parameters and Secrets remedies this by introducing a local endpoint to retrieve values on-the-fly. For example, the following HTTP request returns a value from the Parameter Store:
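A minimal Python sketch of such a request from inside a Lambda function could look like this. It assumes the extension's default port 2773 and the session token header described in the AWS documentation. The helper names and the parameter name are ours.

```python
import json
import os
import urllib.request
from urllib.parse import urlencode

EXTENSION_PORT = 2773  # assumption: default port of the extension


def parameter_request(name, session_token, port=EXTENSION_PORT):
    """Build URL and headers for fetching a parameter from the local endpoint."""
    query = urlencode({"name": name})  # URL-encodes slashes in the name
    url = f"http://localhost:{port}/systemsmanager/parameters/get?{query}"
    headers = {"X-Aws-Parameters-Secrets-Token": session_token}
    return url, headers


def get_parameter(name):
    """Fetch a parameter value; only works inside Lambda with the layer attached."""
    url, headers = parameter_request(name, os.environ["AWS_SESSION_TOKEN"])
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["Parameter"]["Value"]


# Hypothetical parameter name and a dummy token, just to show the resulting request.
print(parameter_request("/my/param", "dummy-token"))
```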
The parameters and secrets get cached for 5 minutes.
What we don’t like about the solution is that it is not a feature of Lambda but a Lambda layer that you deploy along with your function. You are running code provided by AWS in the form of a Lambda layer. In our experience, that also means that we, as a customer, are in charge when things go wrong, as this is not part of the managed service.
Since then, AWS has released SDKs simplifying the process of integrating IVS into a web or mobile app. Now, there is even an SDK to embed a stream chat.
We will definitely play around with that and might host our live streams on AWS instead of YouTube in the future.
IAM Identity Center adds session management features for improved user experience and cloud security
This announcement got our attention because we’d love to configure the session timeout, especially for temporary credentials fetched by aws sso login.
However, it seems like extending the session timeout does not have any effect on the temporary AWS credentials.
Also, we tested deleting the session of a user authenticated via Google. After deleting the session, the user could still access the portal and the management console. Therefore, we created a bug report and asked AWS for clarification.
In theory, adding a dark mode for your web application should not be a big deal. However, AWS celebrates this announcement with exaggerated enthusiasm. Maybe releasing dark mode was much harder than it should be.
Anyway, we can’t recommend dark mode. Unless you like being blinded by a white screen now and then.
We have been reading through the documentation of AWS Nitro Enclaves. To be honest, it is not that easy to get your head around the secure enclaves by AWS. In summary, Nitro Enclaves provide a secure virtual machine coupled with your EC2 instance that you can use to process sensitive data.
Now, almost all modern instance types support Nitro Enclaves. In our opinion, the most important use case is support for AWS Certificate Manager (ACM) certificates for EC2 instances. AWS provides a service that you can run to update the certificates for Apache/NGINX. To access ACM, the EC2 instance uses a Nitro Enclave.