EC2 Instances 2.0 - Time to Update Your Toolbox

Michael Wittig – 28 Jan 2020

Amazon Elastic Compute Cloud (EC2) has more than 13 years of public history and is one of the oldest AWS services. EC2 is a mature service that has reinvented itself many times.

But there are still two approaches when it comes to managing EC2 instances: mutable and immutable.

A mutable EC2 instance is created once and then lives for many years. Humans log on to the machine (e.g., via SSH or RDP) and do their work. OS updates are applied to the running system; new packages are installed from time to time; configuration files are modified when needed. Deployments happen while the EC2 instance is running.

An immutable EC2 instance is never changed after creation. If you want to update the OS, you create a new EC2 instance that starts from a fresher image (AMI). If new packages are needed, a new AMI is created that contains those packages. If a new deployment is necessary, a new AMI is built and rolled out by replacing the EC2 instances. The EC2 instance is ephemeral and must not be used to persist data!

In this blog post, I will focus on the mutable approach and show you how to solve everyday challenges with the tools and features that AWS provides in 2020:

  • Patching
  • Backup and Restore
  • Remote Access
  • Software Deployments
  • Monitoring
  • Logs
  • Single Point of Failure

If you prefer the immutable approach, Packer by HashiCorp is still the best approach to create AMIs.

You can find a link to a CloudFormation template with the implementation of all best practices at the end of the article.

Patching

As soon as you launch an EC2 instance, you have to ask yourself one question: How can I keep this machine up-to-date? The best option today is provided by AWS Systems Manager (SSM). A combination of the following capabilities (aka Patch Manager) allows us to patch EC2 instances during a predefined window in a configurable way:

  • Patch Baseline: Defines which patches are approved for installation on your instance (e.g., install critical patches 7 days after they are released).
  • Document AWS-RunPatchBaseline: The script that installs the patches approved by the baseline.
  • Maintenance Window: Executes the document on a set of EC2 instances within a recurring time window.
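
To make the patch baseline a bit more tangible, here is a minimal Python sketch of a request you could send to the SSM CreatePatchBaseline API via boto3. The baseline name and the classification and severity filters are assumptions I picked for illustration; this is not the AWS default baseline. The 7-day delay mirrors the example in the list above.

```python
def patch_baseline_request(name="example-al2-baseline", approve_after_days=7):
    """Build keyword arguments for boto3's ssm.create_patch_baseline.

    The name and the filters are illustrative assumptions.
    """
    return {
        "Name": name,
        "OperatingSystem": "AMAZON_LINUX_2",
        "ApprovalRules": {
            "PatchRules": [
                {
                    "PatchFilterGroup": {
                        "PatchFilters": [
                            {"Key": "CLASSIFICATION", "Values": ["Security"]},
                            {"Key": "SEVERITY", "Values": ["Critical"]},
                        ]
                    },
                    # Approve a patch this many days after its release.
                    "ApproveAfterDays": approve_after_days,
                }
            ]
        },
    }


def create_patch_baseline(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("ssm").create_patch_baseline(**request)
```

Building the request separately from sending it keeps the rule easy to review and test before anything touches your account.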

The default patch baseline for Amazon Linux 2 looks like this:

Default Patch Baseline for Amazon Linux 2

The maintenance window is configured to run every day at 12:35 UTC (this is one of the few places in AWS where you can set your timezone!)

Maintenance Window
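
The schedule described above could be expressed roughly like this when creating the window with boto3. Treat it as a sketch: the window name, the 2-hour duration, and the 1-hour cutoff are assumptions, and the cron expression follows the six-field Systems Manager format as I understand it, so verify it against the documentation before you rely on it.

```python
def maintenance_window_request(name="example-patching-window", timezone="UTC"):
    """Build keyword arguments for boto3's ssm.create_maintenance_window.

    Name, Duration, and Cutoff are illustrative assumptions.
    """
    return {
        "Name": name,
        # Minutes, hours, day-of-month, month, day-of-week, year:
        # every day at 12:35 in the time zone given below.
        "Schedule": "cron(35 12 ? * * *)",
        # IANA time zone name, e.g. "UTC" or "Europe/Berlin".
        "ScheduleTimezone": timezone,
        "Duration": 2,  # hours the window stays open
        "Cutoff": 1,    # stop starting new tasks this many hours before the end
        "AllowUnassociatedTargets": False,
    }


def create_maintenance_window(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("ssm").create_maintenance_window(**request)
```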

You also get full insights into the executions of the maintenance window.

Maintenance Window executions

Use CloudWatch Event Rules to subscribe to failures. Our Slack bot marbot can set up the CloudWatch Event Rules for you.

Backup and Restore

Mutable EC2 instances likely contain data that needs to be backed up. The best way to perform backups of EC2 instances is AWS Backup. AWS Backup allows us to back up EC2 instances during a predefined window and manages the lifecycle of a backup as well (e.g., delete backups after 30 days).
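
A daily backup with a 30-day lifecycle could look roughly like the following request to the AWS Backup CreateBackupPlan API. This is a sketch under assumptions: the plan name, rule name, vault name, schedule, and window lengths are placeholders, not values from my templates.

```python
def backup_plan_request(vault="Default", retention_days=30):
    """Build keyword arguments for boto3's backup.create_backup_plan.

    Plan name, rule name, schedule, and windows are illustrative assumptions.
    """
    return {
        "BackupPlan": {
            "BackupPlanName": "example-daily-ec2",
            "Rules": [
                {
                    "RuleName": "daily",
                    "TargetBackupVaultName": vault,
                    # Every day at 05:00 UTC.
                    "ScheduleExpression": "cron(0 5 ? * * *)",
                    "StartWindowMinutes": 60,
                    "CompletionWindowMinutes": 180,
                    # Delete each backup after the retention period.
                    "Lifecycle": {"DeleteAfterDays": retention_days},
                }
            ],
        }
    }


def create_backup_plan(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("backup").create_backup_plan(**request)
```

A plan alone does not back up anything. You also need a backup selection that assigns your EC2 instances to the plan, for example by tag.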

The following screenshot shows a list of daily backups. You can restore any of these backups right through AWS Backup.

AWS Backup backups

You also get full insights into the backup jobs.

AWS Backup jobs

Use AWS Backup Events to subscribe to failures.

Keep in mind that EC2 backups performed by AWS Backup are “crash consistent”. Writes not flushed to disk can cause data corruption.
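
In other words, the backup only contains what has reached the volume, much like after a power cut. If your application writes important state to files, it can reduce the risk by flushing explicitly. A minimal sketch in Python; the file name and payload are made up for the demo.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes and force them to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to disk


def demo():
    """Write a file durably in a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "state.bin")
        durable_write(path, b"order-42")
        with open(path, "rb") as f:
            return f.read()
```

Databases usually take care of this themselves, but they may still need to run their recovery procedure after you restore a crash-consistent backup.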

Remote Access

To modify a mutable EC2 instance, you likely want to open an SSH/RDP connection to your instance. Remote access comes with several challenges:

  • configuration of security groups
  • distribution of credentials
  • rotation of credentials
  • SSH client needs to be installed and configured on your machine

The less painful approach is to use AWS SSM Session Manager. Session Manager is integrated into the AWS Management Console and can also be used in your terminal.

Keep in mind that your IAM permissions now also manage who can become root on any EC2 instance.
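
One way to keep that under control is to allow sessions only to instances that carry a specific tag. The following sketch builds such an IAM policy document in Python. The tag key and value are hypothetical, and the statement is deliberately incomplete: depending on your setup, you may need further statements, for example for terminating sessions.

```python
import json


def session_manager_policy(tag_key, tag_value):
    """Build an IAM policy that allows ssm:StartSession on tagged instances."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # Only instances with the given tag can be targeted.
                "Condition": {
                    "StringLike": {f"ssm:resourceTag/{tag_key}": [tag_value]}
                },
            }
        ],
    }


def render_policy(tag_key, tag_value):
    """Return the policy as a JSON string, ready to attach to a user or role."""
    return json.dumps(session_manager_policy(tag_key, tag_value), indent=2)
```

For example, `render_policy("Team", "web")` produces a policy that only permits sessions to instances tagged `Team=web`.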

Software Deployments

Deploying a new software release is a risky task. Instead of uploading a new release to the EC2 instance manually, I recommend using AWS CodeDeploy. CodeDeploy helps you to deploy your software in an automated way with automatic rollback if things go wrong.
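
As a sketch of what triggering such a deployment could look like with boto3: the request below deploys a zip bundle from S3 and enables automatic rollback on failure. The application name, deployment group, bucket, and key are placeholders you pass in; the application and deployment group must already exist.

```python
def deployment_request(application, group, bucket, key):
    """Build keyword arguments for boto3's codedeploy.create_deployment."""
    return {
        "applicationName": application,
        "deploymentGroupName": group,
        "revision": {
            "revisionType": "S3",
            "s3Location": {"bucket": bucket, "key": key, "bundleType": "zip"},
        },
        # Roll back to the last successful revision if the deployment fails.
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": ["DEPLOYMENT_FAILURE"],
        },
    }


def create_deployment(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("codedeploy").create_deployment(**request)
```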

Monitoring

A lot of useful information is published to CloudWatch by default:

  • CPU utilization
  • Network IO
  • Disk IO

AWS CloudWatch CPU utilization

What information is missing?

  • Memory
  • Disk usage

The missing metrics can be collected with the Unified CloudWatch Agent, which is best installed via SSM.
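
The agent reads a JSON configuration file. A minimal sketch that generates a configuration for the two missing metrics; the 60-second interval and the choice to watch only the root file system are assumptions.

```python
import json


def agent_config(interval_seconds=60):
    """Build a CloudWatch agent config that collects memory and disk usage."""
    return {
        "metrics": {
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": ["/"],  # only the root file system
                    "metrics_collection_interval": interval_seconds,
                },
            }
        }
    }


def render_agent_config(interval_seconds=60):
    """Return the configuration as a JSON string for the agent's config file."""
    return json.dumps(agent_config(interval_seconds), indent=2)
```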

Create CloudWatch Alarms to monitor if a metric reaches a threshold.
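
For example, an alarm on high CPU utilization could be created with a request like the one below. The alarm name, the 80 percent threshold, and the two 5-minute evaluation periods are assumptions; the SNS topic ARN is a placeholder you have to provide.

```python
def cpu_alarm_request(instance_id, topic_arn, threshold=80.0):
    """Build keyword arguments for boto3's cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": f"{instance_id}-cpu-high",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per data point
        "EvaluationPeriods": 2,  # data points that must breach the threshold
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # notify this SNS topic
    }


def put_metric_alarm(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("cloudwatch").put_metric_alarm(**request)
```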

Logs

Mutable EC2 instances are around for some time. You can search through the logs as usual: Open a remote session and open the log files in your editor of choice.

If you want to centralize your logs, I recommend shipping them to CloudWatch Logs. The Unified CloudWatch Agent that you learned about before can pipe the logs from the EC2 instance to CloudWatch Logs. With CloudWatch Logs Insights, you can search and visualize the logs with ease.
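
To give you an idea of a Logs Insights query, the sketch below builds a request for the StartQuery API that returns the 20 most recent messages matching a pattern within the last hour. The log group name is a placeholder, and the pattern is inserted into a regular expression as-is, so it must not contain a slash.

```python
import time


def insights_query_request(log_group, pattern="ERROR", minutes=60, now=None):
    """Build keyword arguments for boto3's logs.start_query."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        "queryString": (
            "fields @timestamp, @message"
            f" | filter @message like /{pattern}/"
            " | sort @timestamp desc"
            " | limit 20"
        ),
    }


def start_query(request):
    """Start the query (needs boto3 and valid credentials).

    The response contains a queryId; fetch the rows later with
    logs.get_query_results(queryId=...).
    """
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("logs").start_query(**request)
```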

Single Point of Failure

Remember that a single EC2 instance is always a single point of failure (SPOF). The risk of a failing hypervisor can be limited by configuring automatic instance recovery. Instance recovery does not protect your instance from Availability Zone outages.
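
Automatic instance recovery is configured as a CloudWatch Alarm on the system status check with the EC2 recover action. A minimal sketch; the alarm name and the five 1-minute evaluation periods are assumptions.

```python
def recovery_alarm_request(instance_id, region):
    """Build keyword arguments for boto3's cloudwatch.put_metric_alarm
    that recover the instance when the system status check fails."""
    return {
        "AlarmName": f"{instance_id}-auto-recovery",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Built-in action that moves the instance to healthy hardware.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def put_recovery_alarm(request):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("cloudwatch").put_metric_alarm(**request)
```

Keep in mind that recovery is not available for every instance type and configuration, so check the requirements for your instance.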

Keep in mind that the EC2 SLA does not cover single instances.

Summary

Managing a mutable EC2 instance comes with many responsibilities. In this post, I showed you how to solve everyday challenges by leveraging the latest and greatest capabilities of the AWS platform.

Find a full implementation codified into two CloudFormation templates (al2-mutable-public.yaml and al2-mutable-private.yaml) on GitHub: https://github.com/widdix/aws-cf-templates/tree/master/ec2

Michael Wittig

I’ve been building on AWS since 2012 together with my brother Andreas. We are sharing our insights into all things AWS on cloudonaut and have written the book AWS in Action. Besides that, we’re currently working on bucketAV, HyperEnv for GitHub Actions, and marbot.