Getting Familiar with AWS CloudWatch: A Guide to Logging and Insights
Agenda
CloudWatch is a monitoring and logging service provided by AWS. In this post, let’s focus on the logging component.
CloudWatch allows users to collect, parse, query, analyze, visualize, and act on log events.
In this post we will discuss the following CloudWatch concepts:
- Log Group
- Log Stream
- Log Events
- Filtering
- Logs Insights
- Queries
- Visualizations
- Dashboards
In a follow-up post, we will discuss CloudWatch monitoring and alerting.
This will be a hands-on post; you can follow along in your own AWS account.
Prerequisites
- AWS account with CloudWatch access
You are all set if you are able to access https://console.aws.amazon.com/cloudwatch/home.
Background
Logs are a crucial part of any high-traffic application. They allow engineers to identify and troubleshoot operational issues.
A log statement in a Python application would look like:
logger.info(json.dumps({'event': 'user_login', 'user_id': 10, 'channel': 'google'}))
A log statement in a Node.js app would look like:
logger.info(JSON.stringify({'event': 'user_login', 'user_id': 10, 'channel': 'google'}))
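For these one-liners to run as-is, a logger needs to be configured. Here is a minimal Python setup sketch using the standard logging module (the format and level are just illustrative choices):
import json
import logging

# Emit bare messages at INFO level so each log line is a clean JSON string
logging.basicConfig(level=logging.INFO, format='%(message)s')
logger = logging.getLogger(__name__)

logger.info(json.dumps({'event': 'user_login', 'user_id': 10, 'channel': 'google'}))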
The log statements would get written either to standard output or to a file. However, querying and analyzing logs on individual servers or hosts is highly challenging.
Hence, you need a centralized logging solution. CloudWatch is one such centralized logging solution.
The host would have a background job or a running agent which would periodically ship the logs from the host to the centralised logging solution.
Shipping logs from host to CloudWatch is outside the scope of this article.
In this post, we will explicitly generate logs on CloudWatch itself.
Log Events
Any log statement in your application creates a log event. Hence, the following creates a log event.
logger.info(json.dumps({'event': 'user_login', 'user_id': 10, 'channel': 'google'}))
It creates a log event with content:
{'event': 'user_login', 'user_id': 10, 'channel': 'google'}
Assume there is a function which performs user registration and then dispatches an email to the user.
It would probably have the following two log statements:
logger.info(json.dumps({'event': 'user_registered', 'user_id': 10, 'channel': 'google'}))
# ... some function code
logger.info(json.dumps({'event': 'email_sent', 'user_id': 10, 'email_id': 'ramanujan@gmail.com'}))
It would create two log events.
Hence, any event that you want captured in your application is essentially a log event.
In the following sections, we will see how log events are created in CloudWatch.
Log Group
A Log Group provides the ability to group related log events together.
A single AWS account might be used for multiple applications. We would expect log events of one application to be grouped in one logical entity, while log events of another application are grouped in a separate logical entity.
Or an application might consist of multiple services, say a catalogue service and an analytics service.
You might prefer to have log events of catalogue service separate from analytics service. In such a case a log group can be created for each individual service.
Let’s create a Log Group named tutorial.
Navigate to CloudWatch > Log groups > Create log group.
Enter Log group name as tutorial.
Change the Retention setting to 3 days. This ensures that log events are discarded after 3 days and you stop incurring charges for event storage.
A Log group named tutorial should appear.
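If you prefer scripting the setup instead of clicking through the console, here is a minimal boto3 sketch (assuming AWS credentials and a default region are already configured):
import boto3

logs = boto3.client('logs')

# Create the log group and cap retention at 3 days
logs.create_log_group(logGroupName='tutorial')
logs.put_retention_policy(logGroupName='tutorial', retentionInDays=3)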
Log Streams
A Log Group can have multiple Log Streams. A Log Stream provides a further level of distinction within a log group.
You might want a new log stream to be created for each new deployment of the application. And all log events for this deployment would be part of the log stream.
Let’s create a Log Stream named first inside Log Group tutorial.
Navigate to Log group tutorial > Create log stream.
Enter Log stream name as first.
We can also create another Log stream in this log group. Let’s create another Log stream named second.
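The streams can likewise be created with boto3; a minimal sketch, assuming the tutorial log group from the previous section:
import boto3

logs = boto3.client('logs')

# Create the two log streams inside the 'tutorial' log group
for stream_name in ('first', 'second'):
    logs.create_log_stream(logGroupName='tutorial', logStreamName=stream_name)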
Creating Log Events
As described earlier, anything which you want captured in your application is a log event.
Ideally, you would never create Log events explicitly on CloudWatch.
Your application would emit logs, which would be written on the host. An agent running on the host would ship the logs to CloudWatch.
As that is outside the scope of this tutorial, let’s explicitly create the Log events.
Log events always belong to a log stream.
Let’s navigate to any Log stream. We can use log stream first.
As no events have been captured or created in this stream yet, you would see the message No older events at this moment.
Let’s add an event. We will use a JSON string.
{"event": "user_login", "user_id": 1, "channel": "google"}
This message would start appearing in the Log events table.
Notice how each property and its associated value shows on a separate line. CloudWatch is intelligent enough to identify JSON strings and show different properties on separate lines.
Let’s add a few more events. These events will make querying meaningful:
{"event": "user_login", "user_id": 2, "channel": "facebook"}
{"event": "user_login", "user_id": 3, "channel": "google"}
{"event": "user_login", "user_id": 4, "channel": "google"}
{"event": "user_login", "user_id": 5, "channel": "google"}
{"event": "user_login", "user_id": 6, "channel": "facebook"}
{"event": "user_login", "user_id": 7, "channel": "github"}
{"event": "user_login", "user_id": 8, "channel": "github"}
{"event": "user_login", "user_id": 1, "channel": "facebook"
Once these events are added, all of them should appear in the Log events table.
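If you would rather script this step as well, here is a minimal boto3 sketch that writes the same events to the first stream (timestamps are in milliseconds since the epoch; recent versions of the API no longer require a sequence token):
import json
import time
import boto3

logs = boto3.client('logs')

events = [
    {'event': 'user_login', 'user_id': 2, 'channel': 'facebook'},
    {'event': 'user_login', 'user_id': 3, 'channel': 'google'},
    # ... the remaining events from the list above
]

# Events must be in chronological order; here they all share one timestamp
logs.put_log_events(
    logGroupName='tutorial',
    logStreamName='first',
    logEvents=[
        {'timestamp': int(time.time() * 1000), 'message': json.dumps(e)}
        for e in events
    ],
)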
Filtering
Log events allow filtering. You would notice a search box, which can be used to filter events.
Let’s filter for the term facebook.
The resultant list would show three events. All of these events have facebook as the channel.
This approach has a limitation, though: it would also fetch unstructured log event messages that contain the term facebook.
To verify, add another Log Event with message Posting to facebook. This is a simple string message, and not a JSON string.
Again filter for facebook. The new log event would also show up in the resultant list.
A better approach is to use filter patterns. Filter patterns allow users to specify conditions, so that only log events matching the conditions are returned in the result.
Our intention is to fetch events where channel is facebook.
There are two major rules while writing CloudWatch filter patterns.
- Filter patterns have to be enclosed in {}.
- Set off property selectors with a dollar sign followed by a period (“$.”)
Let’s find events where channel is facebook using a filter pattern.
The filter pattern we used is:
{$.channel = "facebook"}
Our intention is to filter login events where channel is facebook. Ideally, we should apply the filter on two properties, i.e. event and channel. The filter pattern should be:
{$.event = "user_login" && $.channel = "facebook"}
Let’s say we want to find out all login events of user 1. The filter pattern would be:
{$.event = "user_login" && $.user_id = 1}
It would return all logins of user 1, irrespective of the channel.
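These filter patterns are not limited to the console; here is a minimal boto3 sketch using filter_log_events with the same pattern:
import boto3

logs = boto3.client('logs')

# Fetch login events where channel is facebook
response = logs.filter_log_events(
    logGroupName='tutorial',
    filterPattern='{ $.event = "user_login" && $.channel = "facebook" }',
)
for event in response['events']:
    print(event['message'])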
Filters can only get us so far. They can return relevant log events, but they cannot be used to perform advanced operations like aggregations and visualizations. That’s where Logs Insights comes into the picture.
Logs Insights
Logs Insights can be used to search and analyze your log data. Logs Insights has its own purpose-built query language, which provides multiple powerful commands that we will see shortly.
Let’s switch to Logs Insights and run the default, auto-populated query.
We could see only a single Log event. This happened because the log events were created around 2 hours earlier, while the time range selected at the top is 1 hour.
We changed the time range to 3h, i.e. 3 hours, and triggered Run query again.
We could now see the 20 most recent Log events from the last 3 hours.
The query looks like:
fields @timestamp, @message, @logStream, @log
| sort @timestamp desc
| limit 20
Three Logs Insights query language commands are used here:
- fields
- sort
- limit
When a Log event is added to CloudWatch, CloudWatch automatically associates the following fields with the log event:
- @timestamp: The time at which the event was added to CloudWatch
- @message: The raw message for the Log event
- @logStream: The log stream name
- @log: The log group name
Let’s modify the query to retrieve only fields @timestamp and @message. We don’t want to sort the log data, and also don’t want to limit it.
Hence, the query we used is:
fields @timestamp, @message
The output Logs table shows us only two columns in this case, i.e. @timestamp and @message.
The log events we have added are JSON strings with the following properties:
- event
- channel
- user_id
CloudWatch is intelligent enough to recognize this: it parses the JSON string messages and discovers these fields. We can run queries on these fields too. Try the following query:
fields event, channel, user_id
Let’s attempt a query which tells us the number of user logins grouped by channel.
Logs Insights provides a command called stats, which allows grouping by fields and running aggregate functions on them. The query would look like:
stats count(user_id) by channel
The output table shows 4 rows: one each for facebook, github, and google, plus one row where channel is empty. This happened because we have one unstructured log event; remember the event Posting to facebook?
Our intention is to only work on user login events. Note that the filter has to run before stats, since the raw event fields are no longer available after aggregation. Hence, modify the query to the following:
filter event = "user_login"
| stats count(user_id) by channel
This will ensure that only user login events are considered, hence the event Posting to facebook would be excluded.
We also want the resultant column to be named num_users instead of count(user_id), and the output sorted by decreasing number of users. The resulting query would be:
filter event = "user_login"
| stats count(user_id) as num_users by channel
| sort num_users desc
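Logs Insights queries can also be run programmatically. Here is a minimal boto3 sketch that runs the query above over the last 3 hours and polls for the results:
import time
import boto3

logs = boto3.client('logs')

query = (
    'filter event = "user_login" '
    '| stats count(user_id) as num_users by channel '
    '| sort num_users desc'
)

# start_query takes Unix timestamps in seconds
now = int(time.time())
query_id = logs.start_query(
    logGroupName='tutorial',
    startTime=now - 3 * 3600,
    endTime=now,
    queryString=query,
)['queryId']

# Poll until the query finishes
while True:
    result = logs.get_query_results(queryId=query_id)
    if result['status'] in ('Complete', 'Failed', 'Cancelled'):
        break
    time.sleep(1)

for row in result['results']:
    print({f['field']: f['value'] for f in row})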
Saving Queries
Every time you navigate out of Logs Insights, the selected Log Group and the query are lost.
Queries can be saved. This can help run complex queries when needed, without having to re-create them each time.
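Saved queries can be managed programmatically as well; a sketch using boto3's put_query_definition (the name logins_by_channel is just an illustrative choice):
import boto3

logs = boto3.client('logs')

# Save the logins-by-channel query so it appears under saved queries
logs.put_query_definition(
    name='logins_by_channel',
    queryString=(
        'filter event = "user_login" '
        '| stats count(user_id) as num_users by channel '
        '| sort num_users desc'
    ),
    logGroupNames=['tutorial'],
)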
Visualizations
Using the stats command and aggregate functions, we can get data into a format that can be graphed.
We can visualize this data using the different chart types supported by CloudWatch.
Let’s first create a pie chart which shows the number of logins per channel. Run the logins-by-channel query from the previous section and switch the visualization to Pie; switching to Bar gives the equivalent bar chart.
A line chart, on the other hand, needs time-series data. Let’s write a query which finds the number of logins grouped into 1-minute bins:
filter event = "user_login"
| stats count(user_id) as num_users by bin(1m)
Run this query and switch the visualization to Line. You should see a Line chart.
Dashboards
CloudWatch allows creating dashboards. One dashboard can have multiple visualizations.
Let’s add the bar chart, pie chart, and line chart to a dashboard.
Navigate to Dashboards > Create dashboard and name the dashboard Tutorial. Then, from Logs Insights, run the line chart query and choose Add to dashboard.
This should create a Dashboard called Tutorial with one Line chart visualization. Repeat the same steps to add the other two visualizations.
You can change the time range from 1h, i.e. 1 hour, to 12h, i.e. 12 hours. The visualizations on the dashboard update in real time when the time range changes.
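Dashboards can be created programmatically too. Below is a sketch using put_dashboard with a single Logs Insights line-chart widget; the widget JSON is a simplified assumption based on the CloudWatch dashboard body format, so consult the documentation for the full schema:
import json
import boto3

cloudwatch = boto3.client('cloudwatch')

# A 'log' widget renders a Logs Insights query; 'view' picks the chart type
dashboard_body = {
    'widgets': [
        {
            'type': 'log',
            'x': 0, 'y': 0, 'width': 12, 'height': 6,
            'properties': {
                'query': (
                    "SOURCE 'tutorial' "
                    '| filter event = "user_login" '
                    '| stats count(user_id) as num_users by bin(1m)'
                ),
                'region': 'us-east-1',  # assumed region; use your own
                'view': 'timeSeries',
                'title': 'Logins per minute',
            },
        },
    ],
}

cloudwatch.put_dashboard(
    DashboardName='Tutorial',
    DashboardBody=json.dumps(dashboard_body),
)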
Recap
We started with Log Groups and saw how one Log Group can have multiple Log Streams.
We understood what a Log Event represents and how it looks in CloudWatch. We saw how CloudWatch can parse JSON string log events and discover their fields.
We then moved to filtering and querying and explored related concepts like filter patterns and Logs Insights.
We visualized the filtered and aggregated events using different visualizations.
And finally, we explored CloudWatch dashboards and added several visualizations to a dashboard.
Happy Coding!!