Amazon CloudWatch is a service for monitoring AWS resources and the applications you run on AWS. It enables real-time monitoring of resources such as EC2 instances, RDS database instances, and load balancers. You can use CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically react to changes in your AWS resources. It automatically provides metrics such as CPU utilization, latency, and request counts. You can also monitor additional metrics, such as memory usage and error rates, by publishing them as custom metrics.
CloudWatch metrics give users visibility into resource utilization, application performance, and operational health. This visibility helps you resolve technical issues, streamline processes, and keep your applications running smoothly.
The following concepts are important for understanding CloudWatch metrics:
- Namespaces: a container for CloudWatch metrics. Metrics in different namespaces are isolated from each other so that metrics from different applications are not accidentally aggregated for computing statistics.
- Metrics: a metric represents a time-ordered set of data points published to CloudWatch. Think of a metric as a variable to monitor, with the data points being the values of that variable over time. Metrics exist only in the region in which they are created.
- Dimensions: a dimension is a name/value pair that is part of the identity of a metric. You can assign up to 30 dimensions to a metric. Dimensions help you design a structure for your statistics plan.
- Statistics: aggregations of metric data over a period of time that you specify. Aggregation is made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period you specify.
- Percentiles: a percentile indicates the relative standing of a value in a dataset. For example, the 95th percentile is the value that 95 percent of the data points fall at or below. Percentiles give you a better understanding of the distribution of your metric data and are often used to detect anomalies.
- Alarms: used to initiate actions on your behalf. An alarm monitors a metric over a specified interval of time and performs the assigned actions based on the value of the metric relative to a threshold over time.
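To make the relationship between data points, statistics, and alarms more concrete, here is a minimal local sketch in Python. It does not call AWS at all: the function names `aggregate` and `alarm_state`, the sample values, and the "breach on every recent period" rule are illustrative assumptions, and real CloudWatch alarms offer more options than this model covers.

```python
from collections import defaultdict
from statistics import mean


def aggregate(datapoints, period):
    """Group (timestamp_seconds, value) pairs into fixed-length periods
    and compute basic statistics for each period."""
    buckets = defaultdict(list)
    for ts, value in datapoints:
        buckets[ts - ts % period].append(value)  # key = start of the period
    return {
        start: {
            "SampleCount": len(values),
            "Sum": sum(values),
            "Average": mean(values),
            "Minimum": min(values),
            "Maximum": max(values),
        }
        for start, values in sorted(buckets.items())
    }


def alarm_state(stats, statistic, threshold, evaluation_periods):
    """Return "ALARM" when the statistic is above the threshold in each of
    the most recent `evaluation_periods` periods, "OK" otherwise, and
    "INSUFFICIENT_DATA" when there are not enough periods to decide."""
    recent = [s[statistic] for s in stats.values()][-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(v > threshold for v in recent) else "OK"


# Hypothetical CPU utilization samples: (seconds since start, percent)
cpu = [(0, 40.0), (60, 50.0), (300, 85.0), (360, 95.0), (600, 90.0), (660, 92.0)]

stats = aggregate(cpu, period=300)
state = alarm_state(stats, "Average", threshold=80.0, evaluation_periods=2)
print(stats[0]["Average"], state)  # 45.0 ALARM
```

The first five-minute period averages 45 percent, but the last two periods average 90 and 91 percent, so an alarm with an 80 percent threshold over two periods ends up in the ALARM state.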
CloudWatch can be thought of as a metrics repository. An AWS service (EC2, RDS, etc.) puts metrics into the repository, and you retrieve statistics based on those metrics. If you put your own custom metrics into the repository, you can retrieve statistics on those as well. You can graph the statistics in the CloudWatch console. You can configure alarm actions to stop, terminate, reboot, or recover an EC2 instance. In addition, you can create alarms that initiate Amazon EC2 Auto Scaling and Amazon Simple Notification Service (SNS) actions on your behalf.
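As a rough illustration of putting custom data into the repository, the sketch below builds a request in the shape that boto3's `put_metric_data` call accepts. The namespace, metric name, and instance ID are hypothetical placeholders, and the helper functions `build_metric_request` and `publish` are names made up for this example. Actually sending the request assumes boto3 is installed and AWS credentials are configured, so that call is left commented out.

```python
from datetime import datetime, timezone


def build_metric_request(namespace, name, value, unit, dimensions):
    """Build keyword arguments in the shape used by boto3's put_metric_data."""
    return {
        "Namespace": namespace,
        "MetricData": [{
            "MetricName": name,
            "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
            "Timestamp": datetime.now(timezone.utc),
            "Value": value,
            "Unit": unit,
        }],
    }


def publish(request):
    """Send the data point to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request can be built without boto3
    boto3.client("cloudwatch").put_metric_data(**request)


request = build_metric_request(
    namespace="MyApp/Orders",      # hypothetical custom namespace
    name="MemoryUsedPercent",      # hypothetical metric name
    value=71.5,
    unit="Percent",
    dimensions={"InstanceId": "i-0123456789abcdef0"},  # hypothetical instance ID
)
print(request["MetricData"][0]["Dimensions"])
# publish(request)  # uncomment to actually send the data point
```

Keeping the request construction separate from the network call makes it easy to inspect or test what would be sent before anything reaches AWS.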
Amazon CloudWatch Logs
CloudWatch Logs helps users monitor, store, and access log files from EC2 instances, CloudTrail, Lambda functions, and other sources. With CloudWatch Logs, you can troubleshoot your systems and applications. It offers near real-time monitoring, and users can search for specific phrases, values, or patterns. CloudWatch Logs is a managed service that can be provisioned from within your AWS account with no extra purchases. It is easy to work with from the AWS console or the AWS CLI, and it has deep integration with other AWS services. It can also trigger alerts when certain patterns occur in the logs.
To collect logs, AWS offers both a newer unified CloudWatch agent and an older CloudWatch Logs agent. AWS recommends using the unified CloudWatch agent. When you install the CloudWatch Logs agent on an EC2 instance, it automatically creates a log group as part of the process. You can also create a log group directly from the AWS console. After the agent begins publishing log data to CloudWatch, users can search and filter that data using metric filters. Metric filters define the terms and patterns to look for in log data as it is sent to CloudWatch Logs. You can use subscriptions to get access to a real-time feed of log events from CloudWatch Logs and have it delivered to other services, such as an Amazon Kinesis stream, an Amazon Kinesis Data Firehose stream, or AWS Lambda, for custom processing, analysis, or loading into other systems.
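The idea behind a metric filter can be sketched locally in a few lines of Python. This is a simplified model rather than the real filter pattern grammar: it only checks that every term appears in a log line, and the log lines and function names (`line_matches`, `count_matches`) are invented for illustration.

```python
def line_matches(terms, line):
    """True if every term appears somewhere in the log line (case-sensitive)."""
    return all(term in line for term in terms)


def count_matches(terms, log_lines):
    """Count the log lines that contain all of the given terms."""
    return sum(1 for line in log_lines if line_matches(terms, line))


# Hypothetical application log lines
log_lines = [
    "2024-01-01T10:00:00Z INFO request served in 12 ms",
    "2024-01-01T10:00:01Z ERROR database timeout",
    "2024-01-01T10:00:02Z ERROR disk full",
    "2024-01-01T10:00:03Z WARN database slow",
]

error_count = count_matches(["ERROR"], log_lines)
db_error_count = count_matches(["ERROR", "database"], log_lines)
print(error_count, db_error_count)  # 2 1
```

In CloudWatch Logs, a count like this would be published as a metric, which can then be graphed or used as the basis for an alarm.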
Amazon CloudWatch Events
CloudWatch Events allows users to consume a near real-time stream of events as changes to their AWS environment take place. These events can subsequently trigger notifications or other actions. CloudWatch Events can monitor actions such as an EC2 instance being launched or shut down, and can detect when an Auto Scaling event occurs. It can also detect when AWS services are provisioned or terminated.
The main components of CloudWatch Events are:
- Events: represented as small blobs of JSON and generated in four ways. First, they can arise from within AWS when a resource changes its state. Second, they can be generated by API calls and console sign-ins, which are delivered to CloudWatch Events via CloudTrail. Third, your own code can generate application-level events and publish them to CloudWatch Events for processing. Finally, they can be issued on a schedule, with options for periodic or cron-style scheduling.
- Rules: match incoming events and route them to one or more targets for processing. Rules are not processed in any particular order; every rule that matches an event is processed.
- Targets: process events and are specified within the rules. There are four initial target types: built-in, Lambda functions, Kinesis streams, and SNS topics. A single rule can specify multiple targets.
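The way events, rules, and targets fit together can be sketched with a small matcher. The code below is a simplified local model, not the service's actual matching engine: a pattern field lists the accepted values, nested objects are matched recursively, and the target names are made-up labels. The sample event loosely follows the shape of an EC2 state-change event.

```python
def event_matches(pattern, event):
    """True if every field in the pattern is present in the event and the
    event's value is one of the values the pattern accepts."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event must have a nested object that matches
            if not isinstance(actual, dict) or not event_matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def route(event, rules):
    """Collect the targets of every rule whose pattern matches the event."""
    targets = []
    for rule in rules:
        if event_matches(rule["pattern"], event):
            targets.extend(rule["targets"])
    return targets


event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

rules = [  # target names below are hypothetical labels, not real ARNs
    {"pattern": {"source": ["aws.ec2"],
                 "detail": {"state": ["stopped", "terminated"]}},
     "targets": ["sns:ops-alerts", "lambda:cleanup"]},
    {"pattern": {"source": ["aws.autoscaling"]},
     "targets": ["lambda:scale-report"]},
    {"pattern": {"detail-type": ["EC2 Instance State-change Notification"]},
     "targets": ["kinesis:audit-stream"]},
]

print(route(event, rules))
# ['sns:ops-alerts', 'lambda:cleanup', 'kinesis:audit-stream']
```

Both the first and third rules match the same event, so all of their targets receive it. The list order in the output is only an artifact of this sketch, since rules themselves have no processing order.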
CloudWatch reduces the burden of monitoring. It can monitor metrics on a wide range of AWS services and lets you create custom metrics when required. Combined with alarms and automated responses, this makes it a very powerful tool for administrators. CloudWatch can also be integrated into existing infrastructure.