Amazon CloudWatch: An ultimate weapon for cloud monitoring.

Sumit
Published in Tensult Blogs
Dec 11, 2018

This blog has moved from Medium to blogs.tensult.com.

Monitoring has become a crucial part of every industry, and it plays a vital role in a company’s growth. An organization should understand its internal working environment, including its system architecture, as well as how it is performing in the market, that is, how customers are reacting to its products and services.

Monitoring is also important because it gives developers and system architects insight into how their products and services perform in a production environment. Monitoring a live environment shows immediately how customers behave and what impact the environment is facing. Constant monitoring also matters for competition: if you wish to stay ahead of your competitors, you need a good monitoring system for your environment. In short, monitoring is vital for the following reasons:

  • Visibility
  • Real-time troubleshooting
  • Customer experience
  • Business success

Amazon Web Services offers a service for exactly this purpose: Amazon CloudWatch. As the name suggests, it is used to monitor almost everything that you run in your cloud environment. Let’s discuss CloudWatch and its various parts in depth and see why and how it is important for your cloud architecture.

Amazon CloudWatch has several parts that work together to give us the result we want:

  • Metrics
  • Alarms
  • Agent or APIs
  • Dashboards
  • Events
  • Logs

If you wish to read more on the above-mentioned terms and an introduction to CloudWatch, please visit here.

Troubleshooting an issue usually means combining metric data with log data and other information. We can also set up automated actions in CloudWatch that add or remove resources in your environment depending on the requirement. Let’s walk through the troubleshooting steps in our system using Amazon CloudWatch.

Collect the data: We collect information from the whole system using the metrics that we define and the logs that the system generates over time.

Monitor the metrics: We can create a dashboard from the metrics that we set, which shows how the system is functioning in real time against the chosen parameters. Everything is presented as graphs for easy understanding.

Act: Since we cannot watch every graph 24 hours a day, 365 days a year, we need alarms to notify us of warnings and failures. Using alarms, we can track a metric in real time and take proper steps to fix the issue.

Analyze: Usually, what we monitor is also saved in log files for future use and for more in-depth troubleshooting if required.
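The Collect and Act steps above could look roughly like this in Python. This is a minimal sketch, not a definitive setup: the metric name, namespace, dimension, threshold, and SNS topic ARN are made-up placeholders, and the `cloudwatch` argument is assumed to be a boto3 CloudWatch client (`boto3.client("cloudwatch")`) with valid AWS credentials.

```python
from datetime import datetime, timezone


def to_dimension_list(dimensions):
    """Convert a plain dict into the Name/Value list CloudWatch expects."""
    return [{"Name": k, "Value": v} for k, v in (dimensions or {}).items()]


def build_metric_datum(name, value, unit="Count", dimensions=None):
    """Build one entry for the MetricData list of PutMetricData."""
    return {
        "MetricName": name,
        "Dimensions": to_dimension_list(dimensions),
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def build_alarm_params(alarm_name, namespace, metric_name, threshold,
                       topic_arn, dimensions=None):
    """Build keyword arguments for PutMetricAlarm.

    The alarm fires when the 5-minute average stays above the
    threshold for two consecutive periods (illustrative values).
    """
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": to_dimension_list(dimensions),
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # e.g. an SNS topic that notifies the team
    }


def publish(cloudwatch, namespace, datum, alarm_params):
    """Send the data point and create or update the alarm.

    `cloudwatch` is assumed to be a boto3 CloudWatch client.
    """
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])
    cloudwatch.put_metric_alarm(**alarm_params)
```

Keeping the payload builders separate from the `publish` call means the request shapes can be checked locally before anything is sent to AWS.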

A good example of a company with a huge customer base that uses Amazon CloudWatch to monitor its systems is the BBC. Broadcasting journalistic content worldwide 24 hours a day, 365 days a year is a crucial part of its business. Such a production environment has to maintain system stability every second, and the BBC knows that very well. It uses features such as Logs, Alarms, and Dashboards, among others, to keep an eye on its systems all the time.

For your reference, below is a screenshot of Amazon CloudWatch.

CloudWatch Screenshot

You can see some valuable information already sorted by Amazon on the left side of the panel: the alarms and warnings. We can also set metrics as per our need and watch them in real time via graphs.
Click the Dashboards tab to create your own graphs for more detailed monitoring. One big advantage of keeping such graphs running is that troubleshooting becomes easier when something goes wrong.
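Dashboards can also be created from code rather than by clicking through the console. The sketch below builds a dashboard body with a single metric widget showing EC2 CPU utilization. The region, instance ID, widget title, and dashboard name are placeholders, and `cloudwatch` is again assumed to be a boto3 CloudWatch client.

```python
import json


def build_dashboard_body(region, instance_id):
    """Return the JSON string for a one-widget dashboard."""
    widget = {
        "type": "metric",
        # Position and size on the dashboard grid.
        "x": 0,
        "y": 0,
        "width": 12,
        "height": 6,
        "properties": {
            # Each metric is [namespace, metric name, dimension name, dimension value].
            "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]],
            "period": 300,
            "stat": "Average",
            "region": region,
            "title": "EC2 CPU utilization",
        },
    }
    return json.dumps({"widgets": [widget]})


def create_dashboard(cloudwatch, name, body):
    """Create or overwrite the dashboard; `cloudwatch` is a boto3 client."""
    cloudwatch.put_dashboard(DashboardName=name, DashboardBody=body)
```

For example, `create_dashboard(client, "example-dashboard", build_dashboard_body("us-east-1", "i-0123456789abcdef0"))` would produce one graph; more widgets can be appended to the `widgets` list in the same way.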

So, to conclude this discussion, which focused mainly on the business point of view, let’s list what we now know about CloudWatch.

Collect: We can easily collect all our data using the defaults to build operational visibility. That information mainly includes logs for future analysis.

Correlate: Using CloudWatch, we can correlate the metrics and logs from the collected data, which in turn helps with faster troubleshooting and with understanding the root cause of an issue.

Automate monitoring: We can also automate monitoring with the new CloudWatch operational dashboards.
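As one illustration of the Correlate point, the sketch below takes the time an alarm fired and builds a CloudWatch Logs query for the log lines around that moment. The log group name, window sizes, and the `ERROR` filter pattern are assumptions chosen for the example, and `logs_client` is assumed to be a boto3 CloudWatch Logs client (`boto3.client("logs")`).

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(moment):
    """CloudWatch Logs expects timestamps in milliseconds since the epoch."""
    return int(moment.timestamp() * 1000)


def build_log_query(log_group, alarm_time, minutes_before=10,
                    minutes_after=5, pattern="ERROR"):
    """Build keyword arguments for FilterLogEvents around an alarm time."""
    return {
        "logGroupName": log_group,
        "startTime": to_epoch_ms(alarm_time - timedelta(minutes=minutes_before)),
        "endTime": to_epoch_ms(alarm_time + timedelta(minutes=minutes_after)),
        "filterPattern": pattern,
    }


def fetch_related_logs(logs_client, query):
    """Return matching log messages from the first page of results only."""
    response = logs_client.filter_log_events(**query)
    return [event["message"] for event in response["events"]]
```

This reads only the first page of results; a fuller version would follow the `nextToken` value in the response to fetch the remaining pages.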
