AWS SLA: Are you able to keep your availability promise?

Andreas Wittig – 31 Jan 2019

Are you offering availability of 99.99% or more to your clients? Bad news, you might not be able to keep your promise!

Recently, AWS announced a number of new Service Level Agreements (SLAs). As a result, it is now possible to calculate the expected availability of most architectures on AWS.

Typically, each SLA contains:

  1. Service Commitment defines the availability objective (e.g., monthly uptime of at least 99.99%).
  2. Definitions specifies the terms used and, most importantly, how the availability of a service is measured.
  3. Service Credits states how AWS compensates customers affected by missed availability objectives (e.g., 30% service credit).
  4. Exclusions defines which circumstances are not covered by the service commitment.

Also, it is essential to distinguish between two different availability definitions:

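  • per period: availability is the share of time during which the service was available. Used by EC2, ELB, and RDS, for example.
  • per request: availability is the share of requests that did not fail. Used by Route 53, S3, Lambda, and DynamoDB, for example.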
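To make the difference concrete, here is a minimal Python sketch of both definitions. It is a simplification: the function names and sample numbers are made up for illustration, and the actual SLA documents define the measurement in more detail than this.

```python
def availability_per_period(total_minutes: float, unavailable_minutes: float) -> float:
    """Share of the period during which the service was available."""
    return (total_minutes - unavailable_minutes) / total_minutes


def availability_per_request(total_requests: int, failed_requests: int) -> float:
    """Share of requests that did not fail."""
    return (total_requests - failed_requests) / total_requests


# Hypothetical sample values, not real measurements.
minutes_in_month = 30 * 24 * 60  # 30-day month
print(f"per period:  {availability_per_period(minutes_in_month, 10):.4%}")
print(f"per request: {availability_per_request(1_000_000, 500):.4%}")
```

Note that the two numbers are not interchangeable: a service with almost no traffic during an outage can look fine per request while missing its per-period objective, and vice versa.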

Generally, each SLA covers a service deployed across multiple Availability Zones within a region.

The following table lists the SLAs published by AWS (see * for details).

Service                 SLA        Type
Route 53                100.0% *   request
ELB/ALB/NLB             99.99% *   period of time
EC2                     99.99% *   period of time
EBS                     99.99% *   period of time
EFS                     99.9% *    period of time
ECS                     99.99% *   period of time
Fargate                 99.99% *   period of time
RDS                     99.95% *   period of time
API Gateway             99.95% *   request
Lambda                  99.95% *   request
DynamoDB                99.99% *   request
S3                      99.9% *    request
CloudFront              99.9% *    request
Step Functions          99.9% *    request
Cognito                 99.9% *    request
Amazon MQ               99.9% *    period of time
Secrets Manager         99.9% *    request
ECR                     99.9% *    request
EKS                     99.9% *    period of time
Kinesis Video Streams   99.9% *    request
Kinesis Data Firehose   99.9% *    request
Kinesis Data Streams    99.9% *    request
EMR                     99.9% *    request

So, how do you calculate the expected availability of your AWS architecture? To do so, we make two assumptions:

  1. Whenever one of the services fails, it affects the client.
  2. There is no dependency between the services. They all fail independently of each other.

In that case, we need to multiply the availability objectives of all services involved. The following figure shows an example of a typical web application running on EC2.

Combined SLA for EC2 Architecture
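The multiplication can be sketched in a few lines of Python. The SLA values come from the table above. The service list `ec2_stack` is a hypothetical example stack and not necessarily the one shown in the figure.

```python
from math import prod

# Availability objectives from the SLA table, expressed as fractions.
SLA = {
    "Route 53": 1.0,
    "ALB": 0.9999,
    "EC2": 0.9999,
    "EBS": 0.9999,
    "RDS": 0.9995,
}


def combined_availability(services: list[str]) -> float:
    """Multiply the objectives, assuming independent failures that all hit the client."""
    return prod(SLA[name] for name in services)


# Hypothetical web application stack (an assumption for illustration).
ec2_stack = ["Route 53", "ALB", "EC2", "EBS", "RDS"]
print(f"{combined_availability(ec2_stack):.2%}")  # prints 99.92%
```

Every additional service in the request path lowers the product, so the combined figure is always at or below the weakest SLA in the chain.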

You can use the same approach to calculate the availability of your serverless application, as illustrated in the following figure.

Combined SLA for Serverless Architecture

The examples shown result in an expected availability of 99.80% to 99.92%, depending on the services involved.
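To get a feeling for what these percentages mean, it helps to convert them into downtime. The following sketch assumes a 30-day month and treats the expected availability as a per-period figure, which is a simplification for architectures that mix per-period and per-request SLAs. The function name is made up for illustration.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def allowed_downtime_minutes(
    availability: float, period_minutes: float = MINUTES_PER_30_DAY_MONTH
) -> float:
    """Downtime that still fits within the given availability objective."""
    return (1.0 - availability) * period_minutes


for availability in (0.9980, 0.9992, 0.9999):
    minutes = allowed_downtime_minutes(availability)
    print(f"{availability:.2%} -> {minutes:.1f} minutes per 30-day month")
```

Under these assumptions, 99.80% corresponds to roughly 86 minutes of downtime per month and 99.92% to roughly 35 minutes, while a promise of 99.99% leaves only about 4 minutes.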

Next, calculate the expected availability for your architecture. Please note that the result is pessimistic because our assumptions cover the worst case.

However, keep in mind that the expected availability only covers failures within your cloud infrastructure. It does not include an error budget for your software or failed deployments, for example.

Are you able to keep your promise?

Are you looking for a way to increase the expected availability of your architecture? Deploy your workload to multiple regions. But be warned: doing so adds complexity, because you need to synchronize your data between the regions.
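As a rough sketch of why multiple regions help: if the workload is available as long as at least one region is up, the unavailabilities multiply instead of the availabilities. The example below rests on strong assumptions that are mine, not part of any SLA: regions fail independently, failover is instant and perfect, and the routing layer and data synchronization never fail. The function name is hypothetical.

```python
def redundant_availability(single_region: float, regions: int) -> float:
    """Availability when at least one of several independent regions must be up."""
    if regions < 1:
        raise ValueError("regions must be at least 1")
    # The workload is down only if every region is down at the same time.
    return 1.0 - (1.0 - single_region) ** regions


# 0.9992 is the single-region example value used for illustration.
for regions in (1, 2):
    print(f"{regions} region(s): {redundant_availability(0.9992, regions):.5%}")
```

In practice the gain is smaller, because failover takes time and the components that route traffic and replicate data have their own availability.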

Andreas Wittig

I’m an independent consultant, technical writer, and programming founder. All these activities have to do with AWS. I’m writing this blog and all other projects together with my brother Michael.

In 2009, we joined the same company as software developers. Three years later, we were looking for a way to deploy our software—an online banking platform—in an agile way. We got excited about the possibilities in the cloud and the DevOps movement. It’s no wonder we ended up migrating the whole infrastructure of Tullius Walden Bank to AWS. This was a first in the finance industry, at least in Germany! Since 2015, we have accelerated the cloud journeys of startups, mid-sized companies, and enterprises. We have penned books like Amazon Web Services in Action and Rapid Docker on AWS, we regularly update our blog, and we are contributing to the Open Source community. Besides running a 2-headed consultancy, we are entrepreneurs building Software-as-a-Service products.
