High availability is a no-brainer: EC2 auto-recovery

Andreas Wittig – 09 Nov 2015

Werner Vogels, CTO of AWS, is often quoted as saying, “Everything fails all the time.” This does not mean AWS is an unreliable cloud provider. Quite the contrary: AWS plans for failure. Its services are designed to be highly available or fault tolerant. Some are by default; others offer tools that let you achieve this goal.

Problem

An EC2 instance (virtual machine) is not highly available by default. The underlying virtualization layer, the operating system of the host, and the hardware of the host are all possible points of failure. If one of these parts breaks, the EC2 instance becomes unavailable.

Solution

AWS offers tools to handle the failure of an EC2 instance. The following figure shows the easiest way to recover from a failure:

  1. The EC2 instance fails for one of the previously described reasons.
  2. A health check of the EC2 instance is performed automatically in the background and reported to CloudWatch, the monitoring service from AWS.
  3. A CloudWatch alarm triggers the recovery of the EC2 instance if the health check detects a failure.
  4. A new EC2 instance will be started automatically to replace the failed one.
  5. The new EC2 instance is a clone of the failed EC2 instance. The instance ID and the private and public IP addresses stay the same. As long as data is stored on EBS volumes, no data is lost.

EC2 auto-recovery process
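
Steps 2 and 3 come down to a single CloudWatch alarm. Here is a minimal sketch of how such an alarm could be described in Python. It assumes the health check in question is the EC2 system status check (the `StatusCheckFailed_System` metric) and that the alarm should fire after five failed one-minute periods. The function name, the alarm name, the instance ID, and the region are placeholders I made up for illustration. The sketch only builds the parameters; actually creating the alarm would need boto3 and AWS credentials.

```python
def recovery_alarm_params(instance_id, region, minutes=5):
    """Build keyword arguments for the CloudWatch PutMetricAlarm API.

    Assumption: the alarm watches the EC2 system status check and triggers
    the built-in recover action after `minutes` failed one-minute periods.
    """
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "Recover the EC2 instance when the system status check fails",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",
        "Period": 60,                      # seconds per evaluation period
        "EvaluationPeriods": minutes,      # consecutive failed periods before alarming
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0,
        # Built-in EC2 action that recovers the instance
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


# Placeholder instance ID and region
params = recovery_alarm_params("i-0123456789abcdef0", "eu-west-1")

# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch", region_name="eu-west-1").put_metric_alarm(**params)
```

The system status check covers problems on the AWS side of the instance (host hardware, host software, power, network), which matches the failure causes described in the problem section.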

The following components are needed to set up auto-recovery for EC2 instances:

  • EC2 instance from C3, C4, M3, M4, R3, or T2 family
  • CloudWatch alarm based on health check
  • Elastic IP if you want to keep the same public IP address after an auto-recovery

Use CloudFormation template

I have written a template that you can use to launch an EC2 instance with auto-recovery. It uses Infrastructure as Code to create the needed components and links. You can use AWS CloudFormation to create your EC2 instance with auto-recovery in minutes. The GitHub repository widdix/aws-cf-templates contains the CloudFormation template for EC2 with auto-recovery and some more useful templates.
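
To give an idea of what such a template wires together, here is a rough sketch that generates a CloudFormation template as JSON from Python. This is not the template from widdix/aws-cf-templates; it is a stripped-down illustration of the three components listed above. The resource names (`Instance`, `ElasticIP`, `RecoveryAlarm`), the `ImageId` parameter, the default instance type, and the five-minute alarm window are my own assumptions.

```python
import json


def auto_recovery_template(instance_type="t2.micro"):
    """Return a minimal CloudFormation template (as a dict) for one EC2
    instance with an Elastic IP and a recovery alarm. Illustrative only."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "EC2 instance with auto-recovery (illustrative sketch)",
        "Parameters": {
            # The AMI to launch is supplied when the stack is created
            "ImageId": {"Type": "AWS::EC2::Image::Id"},
        },
        "Resources": {
            "Instance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "ImageId"},
                    "InstanceType": instance_type,
                },
            },
            "ElasticIP": {
                "Type": "AWS::EC2::EIP",
                "Properties": {"Domain": "vpc", "InstanceId": {"Ref": "Instance"}},
            },
            "RecoveryAlarm": {
                "Type": "AWS::CloudWatch::Alarm",
                "Properties": {
                    "AlarmDescription": "Recover the instance on system status check failure",
                    "Namespace": "AWS/EC2",
                    "MetricName": "StatusCheckFailed_System",
                    "Statistic": "Minimum",
                    "Period": 60,
                    "EvaluationPeriods": 5,
                    "ComparisonOperator": "GreaterThanThreshold",
                    "Threshold": 0,
                    # Built-in recover action for the stack's region
                    "AlarmActions": [
                        {"Fn::Sub": "arn:aws:automate:${AWS::Region}:ec2:recover"}
                    ],
                    "Dimensions": [
                        {"Name": "InstanceId", "Value": {"Ref": "Instance"}}
                    ],
                },
            },
        },
    }


# Serialize the template so it can be saved to a file and passed to CloudFormation
template_json = json.dumps(auto_recovery_template(), indent=2)
```

A real template would also need networking details such as a subnet and a security group, which I left out to keep the sketch short.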

Next steps

This solution can recover a failed EC2 instance, but only within the same availability zone (a group of one or more data centers). If the whole availability zone is affected by an outage, your EC2 instance will fail. It is possible to plan for an outage of an availability zone, too. If you are interested, I can recommend our book Amazon Web Services in Action or the AWS documentation about Auto Scaling and ELB.
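
As a rough pointer in that direction, the following sketch builds a CloudFormation fragment for an Auto Scaling group of size one that spans subnets in more than one availability zone. It is one possible approach, not a recipe from this post. The parameter names (`SubnetIds`, `LaunchTemplateId`, `LaunchTemplateVersion`) and the resource name are placeholders, and the launch template is assumed to exist already. Keep in mind that an instance replaced by Auto Scaling is a fresh instance: unlike auto-recovery, it does not keep its ID, IP addresses, or EBS data.

```python
import json


def single_instance_asg_template():
    """Return a CloudFormation template (as a dict) with an Auto Scaling group
    that keeps exactly one instance running across the given subnets."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # Pass subnets from at least two availability zones
            "SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
            # Assumed to point at an existing launch template
            "LaunchTemplateId": {"Type": "String"},
            "LaunchTemplateVersion": {"Type": "String"},
        },
        "Resources": {
            "AutoScalingGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "MinSize": "1",
                    "MaxSize": "1",
                    "VPCZoneIdentifier": {"Ref": "SubnetIds"},
                    "LaunchTemplate": {
                        "LaunchTemplateId": {"Ref": "LaunchTemplateId"},
                        "Version": {"Ref": "LaunchTemplateVersion"},
                    },
                },
            },
        },
    }


asg_json = json.dumps(single_instance_asg_template(), indent=2)
```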

Andreas Wittig

I’m an independent consultant, technical writer, and programming founder. All these activities have to do with AWS. I’m writing this blog and all other projects together with my brother Michael.

In 2009, we joined the same company as software developers. Three years later, we were looking for a way to deploy our software—an online banking platform—in an agile way. We got excited about the possibilities in the cloud and the DevOps movement. It’s no wonder we ended up migrating the whole infrastructure of Tullius Walden Bank to AWS. This was a first in the finance industry, at least in Germany! Since 2015, we have accelerated the cloud journeys of startups, mid-sized companies, and enterprises. We have penned books like Amazon Web Services in Action and Rapid Docker on AWS, we regularly update our blog, and we are contributing to the Open Source community. Besides running a 2-headed consultancy, we are entrepreneurs building Software-as-a-Service products.

We are available for projects.

You can contact me via Email, Twitter, and LinkedIn.
