Monitoring AWS EC2 Instances: Creating Alarms for Disk, RAM, and CPU Usage

Tameem Rafay
4 min read · Feb 19, 2024


CloudWatch alarm

In this article, we will walk through how to send EC2 disk and RAM usage data to CloudWatch. We will then create alarms that send an email when usage exceeds a set threshold.

We want to add alerts for these three criteria:

  1. CPU usage above 70%
  2. RAM usage above 70% (requires a custom metric)
  3. Disk usage above 80% (requires a custom metric)

CloudWatch provides a built-in metric for CPU usage, but it does not provide built-in metrics for RAM or disk usage. So, we will send custom data to CloudWatch and create the alarms based on that data.

Send Data to CloudWatch using the CloudWatch agent

Now, we will install the CloudWatch agent, which runs on the EC2 instance and sends the metric data to CloudWatch.

Assign role to EC2 instance

We need to create a role for the EC2 instance, attach the CloudWatchAgentServerPolicy permission policy to this role, and then assign the role to the EC2 instance. This permission allows the EC2 instance to send its metrics to CloudWatch.
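The same role setup can also be scripted with the AWS CLI instead of the console. The following is a minimal sketch, not the only way to do it: the function name setup_agent_role and the role name, profile name, and instance ID in the example call are placeholders introduced here for illustration. It assumes the AWS CLI is installed and configured with permission to manage IAM.

```shell
# Sketch: create an instance role for the CloudWatch agent with the AWS CLI.
# setup_agent_role is a hypothetical helper; the names you pass in are your own choice.
setup_agent_role() {
  # $1 = role name, $2 = instance profile name, $3 = EC2 instance ID
  # Trust policy that lets the EC2 service assume the role.
  trust='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "$trust" &&
  aws iam attach-role-policy --role-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy &&
  aws iam create-instance-profile --instance-profile-name "$2" &&
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$2" --role-name "$1" &&
  aws ec2 associate-iam-instance-profile --instance-id "$3" \
    --iam-instance-profile "Name=$2"
}

# Example call (placeholder names and instance ID):
# setup_agent_role CloudWatchAgentRole CloudWatchAgentProfile i-0123456789abcdef0
```

The steps are chained with && so that a failure in one step stops the rest.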

Now, connect to your EC2 instance using SSH and run the following command to install the agent. Then we will create the file amazon-cloudwatch-agent.json, which holds the configuration that the agent will run with.

# Install the agent on your EC2 machine (this yum package is available on Amazon Linux)
sudo yum install amazon-cloudwatch-agent -y

CloudWatch agent configuration

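Create the new file using the following command and paste in the given configuration. Here, the mem and disk sections each set metrics_collection_interval to 60, so RAM and disk data are collected every 60 seconds. The value of 300 in the agent section is the default interval (5 minutes) and applies only to metrics that do not set their own interval.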

sudo nano /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
{
  "agent": {
    "metrics_collection_interval": 300,
    "run_as_user": "root"
  },
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      }
    }
  }
}
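A stray comma or missing brace in this file is an easy mistake to make when pasting, so it can help to check that the file is well-formed JSON before starting the agent. The sketch below assumes python3 is available on the instance; validate_agent_config is a hypothetical helper name. It checks JSON syntax only, not whether the keys are ones the agent understands.

```shell
# Sketch: check that the agent config is well-formed JSON.
# Assumes python3 is installed; this does not validate the agent's schema.
validate_agent_config() {
  # $1 = path to the config file
  if python3 -m json.tool "$1" > /dev/null 2>&1; then
    echo "valid JSON: $1"
  else
    echo "invalid JSON: $1" >&2
    return 1
  fi
}

# Example call:
# validate_agent_config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
```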

Start the CloudWatch agent

Once you have saved the above configuration in the amazon-cloudwatch-agent.json file, start the agent so that it sends the data to CloudWatch.

# Start the CloudWatch agent with the configuration file
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json

Troubleshoot the agent in case of any error

The following commands are useful if the agent is not working as expected.

# Check the status of the CloudWatch agent
sudo systemctl status amazon-cloudwatch-agent.service

# Restart the agent after changing its configuration or the .json file
sudo systemctl restart amazon-cloudwatch-agent.service

# Check the logs of this service if the agent is not working as expected
sudo journalctl -u amazon-cloudwatch-agent.service
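If the agent is running but you are unsure whether the numbers in CloudWatch look right, a rough local reading gives you something to compare against. This is a sketch under assumptions: the two function names are hypothetical, the memory figure is derived from MemTotal and MemAvailable in /proc/meminfo, and the agent may compute its own value slightly differently, so expect the numbers to be close rather than identical.

```shell
# Memory in use as a percentage, from MemTotal and MemAvailable in /proc/meminfo.
local_mem_used_percent() {
  awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
       END {printf "%.1f\n", (total - avail) * 100 / total}' /proc/meminfo
}

# Disk space in use on a mount point (default "/"), as a whole percentage.
local_disk_used_percent() {
  df -P "${1:-/}" | awk 'NR==2 {sub("%", "", $5); print $5}'
}

echo "mem:  $(local_mem_used_percent)%"
echo "disk: $(local_disk_used_percent /)%"
```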

1. Create a disk usage alarm in CloudWatch

The above configuration sends the data to CloudWatch. Now we will create an alarm that sends an email through an SNS topic if the value is greater than the set threshold.
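The alarm needs an SNS topic with an email subscription to notify. If you do not have one yet, it can be created from the AWS CLI as sketched below. The helper name create_alert_topic and the topic name and email address in the example call are placeholders, and the sketch assumes the AWS CLI is configured for the same region as the instance.

```shell
# Sketch: create an SNS topic and subscribe an email address to it.
# create_alert_topic is a hypothetical helper name.
create_alert_topic() {
  # $1 = topic name, $2 = email address to notify
  topic_arn="$(aws sns create-topic --name "$1" \
    --query TopicArn --output text)" || return 1
  aws sns subscribe --topic-arn "$topic_arn" \
    --protocol email --notification-endpoint "$2" > /dev/null || return 1
  # Print the ARN so it can be used as the alarm action later.
  printf '%s\n' "$topic_arn"
}

# Example call (placeholder topic name and address):
# create_alert_topic ec2-usage-alerts you@example.com
```

SNS sends a confirmation email to the address, and the subscription only becomes active once the link in that email is clicked.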

Now, create the alarm in CloudWatch and click on Select metric. Here you will see a section for a custom namespace named CWAgent (CloudWatch agent).

In the custom namespace, you will see two sections: one for disk space and the other for RAM. To add the disk usage alarm, click on the first one, i.e. the InstanceId, device, fstype, path dimension set.

Here, click on the xvda1 device. This shows the current disk usage of that device, and you can set your alarm based on its value.
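The same disk alarm can be created from the AWS CLI. The sketch below makes several assumptions that you should check against what the console shows for your instance: the metric appears as disk_used_percent, the root volume is mounted at / on device xvda1, and the filesystem type is xfs. The helper name create_disk_alarm, the alarm name, and the 5-minute evaluation period are choices made here for illustration.

```shell
# Sketch: alarm when disk usage on the root volume goes above 80%.
# The dimension values must match exactly what CloudWatch shows for your metric;
# path=/, device=xvda1 and fstype=xfs are assumptions.
create_disk_alarm() {
  # $1 = instance ID, $2 = SNS topic ARN
  aws cloudwatch put-metric-alarm \
    --alarm-name "disk-usage-above-80-$1" \
    --namespace CWAgent \
    --metric-name disk_used_percent \
    --dimensions "Name=InstanceId,Value=$1" Name=path,Value=/ \
                 Name=device,Value=xvda1 Name=fstype,Value=xfs \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}

# Example call (placeholder instance ID and topic ARN):
# create_disk_alarm i-0123456789abcdef0 arn:aws:sns:us-east-1:123456789012:ec2-usage-alerts
```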

2. Create a RAM usage alarm in CloudWatch

Now open the custom namespace in the CloudWatch alarm wizard again and click on InstanceId to check your instance's RAM usage. Select the mem_used_percent metric to set the alarm on RAM usage.
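As a CLI alternative, the RAM alarm needs only the InstanceId dimension. This is a sketch with the 70% threshold from the list at the start of the article. The helper name create_ram_alarm, the alarm name, and the 5-minute period are assumptions made here.

```shell
# Sketch: alarm when RAM usage goes above 70%.
create_ram_alarm() {
  # $1 = instance ID, $2 = SNS topic ARN
  aws cloudwatch put-metric-alarm \
    --alarm-name "ram-usage-above-70-$1" \
    --namespace CWAgent \
    --metric-name mem_used_percent \
    --dimensions "Name=InstanceId,Value=$1" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 70 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}

# Example call (placeholder instance ID and topic ARN):
# create_ram_alarm i-0123456789abcdef0 arn:aws:sns:us-east-1:123456789012:ec2-usage-alerts
```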

Final Alarm Dashboard: Here you can see that we have set all three alarms, based on disk, RAM, and CPU utilization.
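The CPU alarm does not need the agent, because EC2 publishes the CPUUtilization metric in the built-in AWS/EC2 namespace. A minimal CLI sketch for the 70% threshold follows. The helper name create_cpu_alarm, the alarm name, and the 5-minute period are assumptions made here.

```shell
# Sketch: alarm when CPU usage goes above 70%, using the built-in EC2 metric.
create_cpu_alarm() {
  # $1 = instance ID, $2 = SNS topic ARN
  aws cloudwatch put-metric-alarm \
    --alarm-name "cpu-usage-above-70-$1" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$1" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 70 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}

# Example call (placeholder instance ID and topic ARN):
# create_cpu_alarm i-0123456789abcdef0 arn:aws:sns:us-east-1:123456789012:ec2-usage-alerts
```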
