AWS Monitoring with EventBridge

Andreas Wittig – 09 Mar 2023

When it comes to AWS monitoring, you probably think of Amazon CloudWatch first. That’s right, but there is another source of information about the health of your cloud infrastructure: Amazon EventBridge. In this blog post, you’ll learn how to tap into EventBridge to get important information about running your cloud infrastructure.

How to configure AWS Monitoring based on EventBridge?

The following diagram shows what is needed to extend your AWS monitoring with the help of EventBridge.

  1. Services like AWS Backup, Amazon EC2, and AWS Health publish events to EventBridge when things go wrong or human intervention is necessary.
  2. EventBridge rules filter those events based on a pattern and forward matching events to an SNS topic.
  3. The SNS topic sends events to on-call engineers via email, SMS, or HTTPS.

AWS Monitoring with EventBridge: Architecture
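
The rules shown in this post only cover the filtering half of step 2. As a minimal sketch of the remaining wiring, the following Terraform configuration creates an SNS topic, allows EventBridge to publish to it, attaches the topic as a target to one rule, and subscribes an email address. The resource names, the topic name, and the email address are placeholders chosen for illustration, and the target refers to the `health_issue` rule defined further down.

```terraform
# Sketch only: resource names, topic name, and email address are placeholders.

# Step 3: the SNS topic that delivers events to on-call engineers.
resource "aws_sns_topic" "monitoring" {
  name = "aws-monitoring"
}

# Topic policy that allows EventBridge to publish to the topic.
data "aws_iam_policy_document" "monitoring_topic" {
  statement {
    effect    = "Allow"
    actions   = ["sns:Publish"]
    resources = [aws_sns_topic.monitoring.arn]

    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_sns_topic_policy" "monitoring" {
  arn    = aws_sns_topic.monitoring.arn
  policy = data.aws_iam_policy_document.monitoring_topic.json
}

# Step 2: forward events matched by a rule to the topic.
resource "aws_cloudwatch_event_target" "health_issue_to_sns" {
  rule      = aws_cloudwatch_event_rule.health_issue.name
  target_id = "sns"
  arn       = aws_sns_topic.monitoring.arn
}

# Example subscriber (hypothetical address); SMS or HTTPS work the same way.
resource "aws_sns_topic_subscription" "on_call_email" {
  topic_arn = aws_sns_topic.monitoring.arn
  protocol  = "email"
  endpoint  = "oncall@example.com"
}
```

Each additional rule needs its own `aws_cloudwatch_event_target` pointing at the same topic. Email subscriptions have to be confirmed by the recipient before SNS delivers anything.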

Below are some example EventBridge rules and event patterns for AWS monitoring. The code snippets are written in Terraform configuration syntax, but the event patterns also work with CloudFormation or in the AWS Management Console.

Monitoring AWS account root user login with EventBridge

The following EventBridge rule (formerly known as a CloudWatch Events rule), defined in Terraform configuration syntax, ensures you are notified whenever someone uses the AWS account root user to log in.

resource "aws_cloudwatch_event_rule" "root_user_login" {
  name          = "aws-monitoring-root-user-login"
  event_pattern = <<JSON
{
  "detail-type": [
    "AWS Console Sign In via CloudTrail"
  ],
  "detail": {
    "userIdentity": {
      "arn": [
        "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root"
      ]
    }
  }
}
JSON
}

data "aws_partition" "current" {}

data "aws_caller_identity" "current" {}

Monitoring AWS Health announcements with EventBridge

Over the past few years, AWS has improved the AWS Health Dashboard and now uses this channel to communicate outages as well as breaking changes to services. Getting notified about new issues helps you keep your cloud infrastructure running.

resource "aws_cloudwatch_event_rule" "health_issue" {
  name          = "aws-monitoring-health-issue"
  event_pattern = <<JSON
{
  "source": [
    "aws.health"
  ],
  "detail-type": [
    "AWS Health Event"
  ]
}
JSON
}
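
The rule above matches every AWS Health event. If that turns out to be too noisy, the pattern could be narrowed by event category. The following is a sketch of such a variant, not part of the original rule set: the resource and rule names are made up for illustration, and it assumes you only want operational issues and scheduled changes while skipping general account notifications.

```terraform
# Sketch only: a narrower alternative to the health_issue rule above.
# The resource name and rule name are placeholders.
resource "aws_cloudwatch_event_rule" "health_issue_or_change" {
  name          = "aws-monitoring-health-issue-or-change"
  event_pattern = <<JSON
{
  "source": [
    "aws.health"
  ],
  "detail-type": [
    "AWS Health Event"
  ],
  "detail": {
    "eventTypeCategory": [
      "issue",
      "scheduledChange"
    ]
  }
}
JSON
}
```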

Monitoring EC2 Auto Scaling with EventBridge

EC2 Auto Scaling launches instances, for example, to add capacity to a fleet. It is crucial to get notified if Auto Scaling fails to launch or terminate an instance, as human intervention is most likely required to fix the problem.

resource "aws_cloudwatch_event_rule" "auto_scaling_failed" {
  name          = "aws-monitoring-auto-scaling-failed"
  event_pattern = <<JSON
{
  "source": [
    "aws.autoscaling"
  ],
  "detail-type": [
    "EC2 Instance Launch Unsuccessful",
    "EC2 Instance Terminate Unsuccessful",
    "EC2 Auto Scaling Instance Refresh Failed"
  ]
}
JSON
}

Monitoring EBS Snapshots with EventBridge

Do you rely on creating EBS snapshots for backing up data? If so, you should keep an eye on failed EBS snapshots by using the following EventBridge rule.

resource "aws_cloudwatch_event_rule" "ebs_failed" {
  name          = "aws-monitoring-ebs-failed"
  event_pattern = <<JSON
{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EBS Snapshot Notification",
    "EBS Multi-Volume Snapshots Completion Status"
  ],
  "detail": {
    "result": [
      "failed"
    ]
  }
}
JSON
}

Monitoring SSM Automation with EventBridge

AWS Systems Manager provides a toolkit to automate the management of EC2 instances. But will you notice when an automation fails during the night? The following EventBridge rule will keep you posted.

resource "aws_cloudwatch_event_rule" "ssm_automation_failed" {
  name          = "aws-monitoring-ssm-automation-failed"
  event_pattern = <<JSON
{
  "source": [
    "aws.ssm"
  ],
  "detail-type": [
    "EC2 Automation Execution Status-change Notification"
  ],
  "detail": {
    "status": [
      "Failed",
      "TimedOut"
    ]
  }
}
JSON
}

Monitoring ECS Tasks with EventBridge

The Elastic Container Service (ECS) orchestrates containers. But sometimes containers fail and exit with a non-zero exit code. The following EventBridge rule notifies you when a task that does not belong to an ECS service stops because an essential container exited with a non-zero exit code.

resource "aws_cloudwatch_event_rule" "ecs_task_failed_non_zero" {
  name          = "aws-monitoring-ecs-task-failed-non-zero"
  event_pattern = <<JSON
{
  "source": [
    "aws.ecs"
  ],
  "detail-type": [
    "ECS Task State Change"
  ],
  "detail": {
    "group": [{"anything-but": {"prefix": "service:"}}],
    "lastStatus": ["STOPPED"],
    "stopCode": ["EssentialContainerExited"],
    "containers": {
      "exitCode": [{"anything-but": 0}]
    }
  }
}
JSON
}

Monitoring ECR Image Scan with EventBridge

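The Elastic Container Registry (ECR) comes with the capability to scan container images for known vulnerabilities. But how do you ensure you are notified about severe findings? The following rule matches completed scans that report at least one finding of CRITICAL, HIGH, MEDIUM, or UNDEFINED severity.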

resource "aws_cloudwatch_event_rule" "ecr_image_scan_finding" {
  name          = "aws-monitoring-ecr-image-scan-finding"
  event_pattern = <<JSON
{
  "source": [
    "aws.ecr"
  ],
  "detail-type": [
    "ECR Image Scan"
  ],
  "detail": {
    "scan-status": ["COMPLETE"],
    "finding-severity-counts": {
      "$or": [
        {"CRITICAL": [{"numeric": [">", 0]}]},
        {"HIGH": [{"numeric": [">", 0]}]},
        {"MEDIUM": [{"numeric": [">", 0]}]},
        {"UNDEFINED": [{"numeric": [">", 0]}]}
      ]
    }
  }
}
JSON
}

Monitoring AWS Certificate Manager (ACM) with EventBridge

We have all experienced downtimes caused by expired SSL/TLS certificates. This doesn’t have to be the case. Monitor AWS Certificate Manager (ACM) and get notified before certificates expire.

resource "aws_cloudwatch_event_rule" "acm_certificate_approaching_expiration" {
  name          = "aws-monitoring-acm-certificate-expiration"
  event_pattern = <<JSON
{
  "source": [
    "aws.acm"
  ],
  "detail-type": [
    "ACM Certificate Approaching Expiration"
  ],
  "detail": {
    "DaysToExpiry": [1, 2, 3, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70]
  }
}
JSON
}
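
The list of days in the rule above is hard-coded. If you would rather tune the thresholds per environment, the same pattern can be built with Terraform's `jsonencode` function and fed from a variable. This is a sketch of an alternative, not the rule from the original module: the variable `acm_expiry_days`, its default values, and the resource and rule names are assumptions made for illustration.

```terraform
# Sketch only: variable name, defaults, and resource name are placeholders.
variable "acm_expiry_days" {
  description = "Remaining days before expiry that should trigger a notification"
  type        = list(number)
  default     = [1, 7, 14, 28]
}

resource "aws_cloudwatch_event_rule" "acm_certificate_expiring_soon" {
  name = "aws-monitoring-acm-certificate-expiring-soon"

  # jsonencode renders the HCL object as a JSON event pattern,
  # so no heredoc or manual string interpolation is needed.
  event_pattern = jsonencode({
    source        = ["aws.acm"]
    "detail-type" = ["ACM Certificate Approaching Expiration"]
    detail = {
      DaysToExpiry = var.acm_expiry_days
    }
  })
}
```

Writing the pattern with `jsonencode` also lets Terraform catch syntax mistakes at plan time, whereas a malformed heredoc only fails once the pattern reaches the EventBridge API.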

Monitoring AWS Backup with EventBridge

Sometimes backup services like AWS Backup provide a false sense of security. After all, what happens if AWS Backup runs into errors when creating important backups? You can use the following EventBridge rule to catch backup errors.

resource "aws_cloudwatch_event_rule" "backup_failed" {
  name          = "aws-monitoring-backup-failed"
  event_pattern = <<JSON
{
  "source": [
    "aws.backup"
  ],
  "detail-type": [
    "Backup Job State Change",
    "Copy Job State Change",
    "Restore Job State Change"
  ],
  "detail": {
    "state": [
      "FAILED"
    ]
  }
}
JSON
}

Monitoring Elastic Beanstalk with EventBridge

Elastic Beanstalk is a popular service for deploying web applications on AWS. You should be the first to know about a problem with your application. The following EventBridge rule notifies you about warnings and errors in your Elastic Beanstalk applications.

resource "aws_cloudwatch_event_rule" "elastic_beanstalk_failed" {
  name          = "aws-monitoring-elastic-beanstalk-failed"
  event_pattern = <<JSON
{
  "source": [
    "aws.elasticbeanstalk"
  ],
  "detail-type": [
    "Elastic Beanstalk resource status change",
    "Other resource status change",
    "Health status change",
    "ManagedUpdateStatusChangeEnabled"
  ],
  "detail": {
    "Severity": [
      "WARN",
      "ERROR"
    ]
  }
}
JSON
}

Summary

When it comes to AWS monitoring, EventBridge is an essential source of information. Use EventBridge rules that forward events to an SNS topic to get notified about issues with your cloud infrastructure.

All examples from this blog post originate from marbot-io/terraform-aws-marbot-monitoring-basic.

Also, please check out our product marbot to roll out monitoring based on CloudWatch and EventBridge with ease.

Andreas Wittig

I’ve been building on AWS since 2012 together with my brother Michael. We are sharing our insights into all things AWS on cloudonaut and have written the book AWS in Action. Besides that, we’re currently working on bucketAV and marbot.