Combine CloudWatch metrics for Auto Scaling or to reduce costs

Andreas Wittig – 17 Dec 2019

Every part of your AWS infrastructure emits utilization metrics. Amazon CloudWatch collects these metrics and lets you visualize them and define alarms. AWS recently announced an exciting new feature that allows you to combine multiple metrics: IF/AND/OR statements for metric math.

CloudWatch metric math

Combining CloudWatch metrics has several advantages:

  1. Simplify your monitoring configuration by reducing the number of CloudWatch alarms.
  2. Reduce costs by reducing the number of CloudWatch alarms (each alarm costs around USD 0.10 per month).
  3. Increase or decrease the desired capacity of an Auto Scaling Group according to multiple metrics (e.g., the typical bottlenecks CPU, memory, and network).

All you need to do is to define a Metric Math Expression that combines multiple metrics. Doing so results in a calculated metric. Next, you can define a CloudWatch alarm or a visualization based on the calculated metric.

Use metric math to combine multiple metrics

Let’s imagine the following scenario: you are using an Auto Scaling Group to launch EC2 instances. Typical bottlenecks of your virtual machines are:

  • CPU
  • Memory
  • Network

How do you get notified or scale out automatically when one of these resources gets scarce? And how do you get notified or scale in automatically when the resources are no longer being used?

The following screenshot shows four basic metrics:

  • CPU Utilization
  • Memory Utilization
  • Network In
  • Network Out

Please note that AWS does not provide a memory utilization metric by default. Therefore, I’m using the CloudWatch Agent to collect the data for a memory utilization metric.

Also, as explained in Monitoring EC2 Network Utilization, you need to combine the Network In and Network Out metrics to calculate the total network throughput. The Network Utilization metric expresses that throughput as a percentage of the available network bandwidth.
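
To make the arithmetic behind the Network Utilization metric concrete, here is a minimal Python sketch. It mirrors the metric math expression used in the CloudFormation template of this article: bytes per period are converted to Gbit/s and then divided by the instance's bandwidth. The function and parameter names are my own, and the defaults (a 300-second period and 0.75 Gbit/s) are the values this article assumes for an m5.large.

```python
def network_utilization_percent(
    network_in_bytes: float,
    network_out_bytes: float,
    period_seconds: int = 300,
    baseline_gbit_per_s: float = 0.75,
) -> float:
    """Return the network utilization as a percentage of the baseline bandwidth.

    The byte counts correspond to the Sum statistic of NetworkIn and
    NetworkOut over a single period.
    """
    total_bits = (network_in_bytes + network_out_bytes) * 8
    gbit_per_s = total_bits / period_seconds / 1_000_000_000
    return gbit_per_s / baseline_gbit_per_s * 100


if __name__ == "__main__":
    # 10 GB received and 4 GB sent within five minutes.
    print(round(network_utilization_percent(10e9, 4e9), 1))  # prints 49.8
```

Adjust the baseline when you run a different instance type, as the available bandwidth varies by type and size.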


However, I want to draw your attention to the Summary Utilization metric:

IF(cpu > 70, 1, 0) OR IF(memory > 75, 1, 0) OR IF(network > 80, 1, 0)
  • If the CPU utilization is above 70%, the metric math expression will return 1.
  • If the memory utilization is above 75%, the metric math expression will return 1.
  • If the network utilization is above 80%, the metric math expression will return 1.
  • Otherwise, the metric math expression will return 0.
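
If you want to check the logic before creating an alarm, the expression can be mimicked in a few lines of Python. This is only a local sketch of how the expression behaves for each datapoint, not how CloudWatch is implemented; the function name and the sample values are made up for illustration.

```python
def summary_utilization(cpu: float, memory: float, network: float) -> int:
    """Mimic IF(cpu > 70, 1, 0) OR IF(memory > 75, 1, 0) OR IF(network > 80, 1, 0)."""
    cpu_high = 1 if cpu > 70 else 0
    memory_high = 1 if memory > 75 else 0
    network_high = 1 if network > 80 else 0
    return 1 if (cpu_high or memory_high or network_high) else 0


# One (cpu, memory, network) sample per five-minute period, all in percent.
samples = [
    (35.0, 60.0, 20.0),  # nothing scarce
    (82.0, 60.0, 20.0),  # CPU above 70
    (70.0, 75.0, 80.0),  # exactly on the thresholds, which does not count
    (10.0, 91.0, 5.0),   # memory above 75
]

if __name__ == "__main__":
    print([summary_utilization(*sample) for sample in samples])  # prints [0, 1, 0, 1]
```

Note that the comparisons are strict: a value that sits exactly on a threshold still yields 0.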

Metric Math Expression

Next, define a CloudWatch alarm based on the Summary Utilization metric. Use 1 for the threshold.

The alarm will transition into the ALARM state when the CPU utilization is above 70%, or the memory utilization is above 75%, or the network utilization is above 80%. Configure the CloudWatch alarm to send a notification or increase the desired capacity of the Auto Scaling Group.
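
For the opposite direction, scaling in when the resources are no longer being used, a second calculated metric could combine the same three metrics with AND instead of OR. The following Python sketch shows one possible expression and mimics its logic. Be aware that the thresholds of 30, 35, and 40 percent are placeholders that I picked for illustration; choose values that fit your workload.

```python
# Hypothetical scale-in counterpart to the Summary Utilization expression.
# The thresholds are illustrative assumptions.
SCALE_IN_EXPRESSION = (
    "IF(cpu < 30, 1, 0) AND IF(memory < 35, 1, 0) AND IF(network < 40, 1, 0)"
)


def low_utilization(cpu: float, memory: float, network: float) -> int:
    """Mimic SCALE_IN_EXPRESSION: 1 only if all three resources are idle."""
    return 1 if (cpu < 30 and memory < 35 and network < 40) else 0


if __name__ == "__main__":
    print(low_utilization(12.0, 20.0, 5.0))  # all idle, prints 1
    print(low_utilization(12.0, 60.0, 5.0))  # memory still busy, prints 0
```

An alarm on this metric can also use 1 as its threshold. Keeping a gap between the scale-out and scale-in thresholds helps to avoid the Auto Scaling Group flapping between both actions.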

Do you prefer Infrastructure as Code? The following code snippet shows how to create the CloudWatch alarm with the help of CloudFormation.

Note: The example assumes that you are running an m5.large instance with a maximum network throughput of about 0.75 Gbit/s.

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  AutoScalingGroupName:
    Type: String
Resources:
  EC2HighUtilization:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'EC2 High Utilization: CPU, memory, or network'
      Metrics:
      # Calculated metric: 1 if any bottleneck exceeds its threshold, otherwise 0.
      # This is the only entry that returns data to the alarm.
      - Id: summary
        Label: EC2 Utilization
        Expression: IF(cpu > 70, 1, 0) OR IF(memory > 75, 1, 0) OR IF(network > 80, 1, 0)
        ReturnData: true
      - Id: cpu
        MetricStat:
          Metric:
            Namespace: AWS/EC2
            MetricName: CPUUtilization
            Dimensions:
            - Name: AutoScalingGroupName
              Value: !Ref AutoScalingGroupName
          Stat: Maximum
          Period: 300
        ReturnData: false
      # Memory utilization is collected by the CloudWatch Agent.
      - Id: memory
        MetricStat:
          Metric:
            Namespace: CWAgent
            MetricName: mem_used_percent
            Dimensions:
            - Name: AutoScalingGroupName
              Value: !Ref AutoScalingGroupName
          Stat: Maximum
          Period: 300
        ReturnData: false
      # Bytes per 300-second period -> Gbit/s -> percentage of 0.75 Gbit/s (m5.large)
      - Id: network
        Label: Network Utilization
        Expression: "((network_in+network_out)/300/1000/1000/1000*8)/0.75*100"
        ReturnData: false
      - Id: network_in
        MetricStat:
          Metric:
            Namespace: AWS/EC2
            MetricName: NetworkIn
            Dimensions:
            - Name: AutoScalingGroupName
              Value: !Ref AutoScalingGroupName
          Stat: Sum
          Period: 300
        ReturnData: false
      - Id: network_out
        MetricStat:
          Metric:
            Namespace: AWS/EC2
            MetricName: NetworkOut
            Dimensions:
            - Name: AutoScalingGroupName
              Value: !Ref AutoScalingGroupName
          Stat: Sum
          Period: 300
        ReturnData: false
      ComparisonOperator: GreaterThanOrEqualToThreshold
      EvaluationPeriods: 1
      DatapointsToAlarm: 1
      Threshold: '1'
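
If you script your infrastructure with Python instead, the same alarm definition can be passed to the put_metric_alarm call of boto3. The following is a sketch under a few assumptions: the helper names and the alarm name are made up, and the create_alarm function requires boto3 as well as AWS credentials. The arguments themselves are built without any AWS access, so you can inspect them first.

```python
from typing import Any, Dict


def metric_stat(metric_id: str, namespace: str, name: str, stat: str, asg: str) -> Dict[str, Any]:
    """Build one raw-metric entry scoped to the Auto Scaling Group."""
    return {
        "Id": metric_id,
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": name,
                "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg}],
            },
            "Stat": stat,
            "Period": 300,
        },
        "ReturnData": False,
    }


def build_alarm_arguments(asg: str) -> Dict[str, Any]:
    """Return keyword arguments for put_metric_alarm, mirroring the template."""
    metrics = [
        {
            "Id": "summary",
            "Label": "EC2 Utilization",
            "Expression": "IF(cpu > 70, 1, 0) OR IF(memory > 75, 1, 0) OR IF(network > 80, 1, 0)",
            "ReturnData": True,  # the only entry the alarm evaluates
        },
        metric_stat("cpu", "AWS/EC2", "CPUUtilization", "Maximum", asg),
        metric_stat("memory", "CWAgent", "mem_used_percent", "Maximum", asg),
        {
            "Id": "network",
            "Label": "Network Utilization",
            "Expression": "((network_in+network_out)/300/1000/1000/1000*8)/0.75*100",
            "ReturnData": False,
        },
        metric_stat("network_in", "AWS/EC2", "NetworkIn", "Sum", asg),
        metric_stat("network_out", "AWS/EC2", "NetworkOut", "Sum", asg),
    ]
    return {
        "AlarmName": f"{asg}-high-utilization",  # hypothetical naming scheme
        "AlarmDescription": "EC2 High Utilization: CPU, memory, or network",
        "Metrics": metrics,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": 1.0,
    }


def create_alarm(asg: str) -> None:
    """Create the alarm; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("cloudwatch").put_metric_alarm(**build_alarm_arguments(asg))
```

As in the template, no alarm actions are configured here. Add AlarmActions with the ARN of an SNS topic or a scaling policy to get notified or to scale.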

That’s all. Happy monitoring!

Summary

As CloudWatch metric math supports IF/AND/OR statements, it is possible to aggregate multiple metrics into a single metric. Doing so allows you to scale an Auto Scaling Group based on multiple metrics as well as reduce the number of CloudWatch alarms, which reduces costs.

This post was originally published on the marbot blog.
