CloudWatch Agent Configuration and Enabling an Alerting Mechanism

Sarath Tamminana
KPMG UK Engineering
6 min read · Dec 21, 2023

Sometimes we are asked to collect a service's metrics and configure an alert that performs some operation whenever a particular metric crosses a specific threshold.

We can achieve that simply by selecting the metric in the CloudWatch Metrics section and configuring an alarm on it.

But what if we need to collect or monitor the underlying system metrics and do the same?

In this scenario, we need an agent that runs on the machine, collects the metrics, and pushes them to CloudWatch. The CloudWatch agent does exactly this with minimal initial configuration.

The CloudWatch agent enables you to do the following:

1. Collect system-level metrics like CPU and RAM from Amazon EC2 instances across operating systems and push them to CloudWatch.

2. Collect system-level metrics from on-premises servers. These can also include servers in other environments that are not managed by AWS.

3. Collect logs from Amazon EC2 instances and on-premises servers running either Linux or Windows Server.

For this demonstration, I launched an EC2 instance with a basic configuration, installed the CloudWatch agent package on it, and configured custom metrics collection.

PART-1

Steps to configure the CloudWatch Agent on a Linux Machine

0. Create an IAM role and attach it to the EC2 instance from which you want to fetch the metrics.

1. Download the package using yum

2. Configure the CloudWatch agent to collect metrics like CPU/memory and push those at 60-second intervals.

3. Start the CloudWatch agent using config.json

4. Check the status from CLI as well as from the console.

0. Create an IAM role and attach it to the EC2 instance from which you want to fetch the metrics.

Attach CloudWatchAgentServerPolicy and create the role

Attach the newly created role to your EC2 instance and update it.
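The same console steps can also be scripted. Below is a minimal sketch using the AWS CLI, assuming it is installed and configured with permission to manage IAM. The role name, instance profile name, and instance ID are hypothetical placeholders, not values from this demonstration. The script only prints the commands by default; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch: create an EC2 role for the CloudWatch agent and attach it to an instance.
# ROLE_NAME, PROFILE_NAME and INSTANCE_ID are hypothetical placeholders.
ROLE_NAME="${ROLE_NAME:-CWAgentDemoRole}"
PROFILE_NAME="${PROFILE_NAME:-CWAgentDemoProfile}"
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
TRUST_POLICY="${TMPDIR:-/tmp}/cwagent-trust-policy.json"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

# Trust policy that lets the EC2 service assume the role.
write_trust_policy() {
  cat > "$TRUST_POLICY" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

create_agent_role() {
  write_trust_policy
  run aws iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document "file://$TRUST_POLICY"
  run aws iam attach-role-policy --role-name "$ROLE_NAME" \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
  # EC2 instances receive a role through an instance profile.
  run aws iam create-instance-profile --instance-profile-name "$PROFILE_NAME"
  run aws iam add-role-to-instance-profile \
    --instance-profile-name "$PROFILE_NAME" --role-name "$ROLE_NAME"
  run aws ec2 associate-iam-instance-profile --instance-id "$INSTANCE_ID" \
    --iam-instance-profile "Name=$PROFILE_NAME"
}

create_agent_role
```

When a role is created in the console, the instance profile is created automatically, which is why the console walkthrough has fewer steps than the CLI version.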

1. Download the package using yum

I logged in to the EC2 instance and checked whether the CloudWatch agent package was already installed.

Command

rpm -qa | grep -i cloudwatch

Now install the CloudWatch agent using yum.

yum install amazon-cloudwatch-agent

After installation, you can run the same rpm command again to validate that the package is present.

2. Configure the CloudWatch agent to collect metrics like CPU/Memory and push those at 60-second intervals.

Command

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

During configuration, the wizard asks you to select a default metrics configuration mode: Basic, Standard, or Advanced. For simplicity, we will select Basic and proceed.

Basic Config Mode

Mem: mem_used_percent

Disk: disk_used_percent

The disk metrics such as disk_used_percent have a dimension for Partition, which means that the number of custom metrics generated is dependent on the number of partitions associated with your instance. The number of disk partitions you have depends on which AMI you are using and the number of Amazon EBS volumes you attach to the server.

Standard Config Mode

CPU: cpu_usage_idle, cpu_usage_iowait, cpu_usage_user, cpu_usage_system

Disk: disk_used_percent, disk_inodes_free

Diskio: diskio_io_time

Mem: mem_used_percent

Swap: swap_used_percent

Advanced Config Mode

CPU: cpu_usage_idle, cpu_usage_iowait, cpu_usage_user, cpu_usage_system

Disk: disk_used_percent, disk_inodes_free

Diskio: diskio_io_time, diskio_write_bytes, diskio_read_bytes, diskio_writes, diskio_reads

Mem: mem_used_percent

Netstat: netstat_tcp_established, netstat_tcp_time_wait

Swap: swap_used_percent

In my case, I selected 2 for the default setup to concentrate on CloudWatch metrics collection.

Once the setup is done, a new file named config.json is created at the following path

/opt/aws/amazon-cloudwatch-agent/bin/config.json

We will use this config.json to start the CloudWatch agent.
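For orientation, here is a rough sketch of what a Basic-mode configuration looks like: memory and disk usage collected every 60 seconds. It is hand-written, not the wizard's exact output, and it writes to a scratch path (CONFIG_PATH, a hypothetical name) so that it does not overwrite the real file.

```shell
#!/usr/bin/env bash
# Sketch of a minimal agent config: memory and disk usage every 60 seconds.
# CONFIG_PATH is a hypothetical scratch location; the wizard writes to
# /opt/aws/amazon-cloudwatch-agent/bin/config.json instead.
CONFIG_PATH="${CONFIG_PATH:-${TMPDIR:-/tmp}/cwagent-config.json}"

# The quoted 'EOF' stops the shell from expanding ${aws:InstanceId},
# which is a placeholder the agent itself resolves at runtime.
cat > "$CONFIG_PATH" <<'EOF'
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      }
    }
  }
}
EOF

echo "Wrote $CONFIG_PATH"
```

In the disk section, the agent prefixes the measurement with the section name, so used_percent is published as disk_used_percent.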

Command to check the status

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status

If you answered yes to the wizard's CollectD question, make sure to install collectd before starting the CloudWatch agent; otherwise the agent fails at startup.

Command to install collectd

yum install collectd

Once the package is installed, you can start the CloudWatch agent and check its status.

Command to start the service

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json

Once the CloudWatch agent has started, the status command reports it as running.
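If you want to check the status from a script rather than by eye, a small helper can inspect the JSON that the status command prints. This is only a sketch: agent_is_running is a hypothetical helper name, and the sample JSON is illustrative rather than captured output.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the agent is running from the JSON printed by
#   amazon-cloudwatch-agent-ctl -m ec2 -a status
# The helper reads that JSON on stdin; the sample below is illustrative only.
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

sample_status='{
  "status": "running",
  "configstatus": "configured"
}'

if printf '%s\n' "$sample_status" | agent_is_running; then
  echo "agent is running"
else
  echo "agent is not running"
fi
```

On the instance itself, you would pipe the real status command into agent_is_running instead of the sample text.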

Before the CloudWatch agent setup, the console shows no CWAgent namespace.

After installation, a new namespace called CWAgent appears, which holds all the custom metrics published by the agent.
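The namespace can also be checked from the command line. The following sketch assumes the AWS CLI is configured for the same region as the instance; INSTANCE_ID is a hypothetical placeholder. It prints the command by default; set DRY_RUN=0 to run it.

```shell
#!/usr/bin/env bash
# Sketch: list the metric names the agent has published for one instance.
# INSTANCE_ID is a hypothetical placeholder.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

list_agent_metrics() {
  run aws cloudwatch list-metrics \
    --namespace CWAgent \
    --dimensions "Name=InstanceId,Value=$INSTANCE_ID" \
    --query 'Metrics[].MetricName' \
    --output text
}

list_agent_metrics
```

An empty result shortly after startup usually just means the first 60-second collection interval has not completed yet.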

PART-2

Steps to configure a CloudWatch Alarm and enable the Notification mechanism

1. Validate metrics availability in the CWAgent namespace.

2. Create an SNS Topic and configure the subscription.

I created a topic named cloudwatchsns and subscribed to it.

3. Create an alarm in CloudWatch that triggers a notification at an 80% threshold.

Go to Alarms, then All alarms, and click Create alarm.
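For reference, the topic and subscription can be created from the AWS CLI as well. This sketch assumes an email subscription, which the walkthrough does not specify; the email address and the topic ARN are hypothetical placeholders. It prints the commands by default; set DRY_RUN=0 to run them.

```shell
#!/usr/bin/env bash
# Sketch: create the SNS topic and subscribe an email address to it.
# EMAIL and TOPIC_ARN are hypothetical placeholders; "aws sns create-topic"
# prints the real ARN, which should replace TOPIC_ARN.
TOPIC_NAME="${TOPIC_NAME:-cloudwatchsns}"
EMAIL="${EMAIL:-you@example.com}"
TOPIC_ARN="${TOPIC_ARN:-arn:aws:sns:us-east-1:123456789012:$TOPIC_NAME}"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

create_topic_and_subscribe() {
  run aws sns create-topic --name "$TOPIC_NAME"
  run aws sns subscribe \
    --topic-arn "$TOPIC_ARN" \
    --protocol email \
    --notification-endpoint "$EMAIL"
}

create_topic_and_subscribe
```

With the email protocol, the subscription stays pending until the recipient confirms it through the link SNS sends.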

Click Select metric and choose the metric you want to monitor. In my case, I chose memory usage (mem_used_percent) for the alarm configuration.

On the next page, once you have selected the metric, set the condition threshold to 80.

Click Next and proceed with the default options.

Give the alarm a name and complete the alarm creation.

The alarm starts in the Insufficient data state right after creation. After about 5 minutes, once enough data points have arrived, it moves to the OK state.
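A roughly equivalent alarm can be defined with the AWS CLI. The sketch below assumes a 5-minute period and an Average statistic, which are illustrative choices; the alarm name, instance ID, and topic ARN are hypothetical placeholders. The dimensions must match exactly what the agent publishes, so adjust them if your config.json appends more than InstanceId. It prints the command by default; set DRY_RUN=0 to run it.

```shell
#!/usr/bin/env bash
# Sketch: alarm when average memory usage exceeds 80% over a 5-minute period.
# ALARM_NAME, INSTANCE_ID and TOPIC_ARN are hypothetical placeholders.
ALARM_NAME="${ALARM_NAME:-demo-mem-used-above-80}"
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
TOPIC_ARN="${TOPIC_ARN:-arn:aws:sns:us-east-1:123456789012:cloudwatchsns}"

# Print commands by default; set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

create_memory_alarm() {
  run aws cloudwatch put-metric-alarm \
    --alarm-name "$ALARM_NAME" \
    --namespace CWAgent \
    --metric-name mem_used_percent \
    --dimensions "Name=InstanceId,Value=$INSTANCE_ID" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$TOPIC_ARN"
}

create_memory_alarm
```

With a 300-second period and a single evaluation period, the alarm needs one full 5-minute window of data before it can leave the Insufficient data state.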


In this way, we can collect custom metrics from an EC2 instance and configure an alerting mechanism on them.

Sarath Tamminana
KPMG UK Engineering

Certified AWS & GCP Cloud Architect currently working at KPMG as an Assistant Manager. About 10 years of experience in Cloud, DevOps & Middleware Technologies.