Analyze data and logs with AWS Elasticsearch and CloudWatch

Paul Skarseth
Dec 1, 2015


On October 1st Amazon announced that Elasticsearch is now available as a new service in their ever-growing AWS environment. Elasticsearch is one of the most popular open source search engines and is used by many organizations, so this is a welcome offering from Amazon. It saves you from setting up and managing your own high-availability cluster, and the pricing isn’t too bad. One of the biggest features introduced along with the service, however, is buried towards the end of the announcement: its integration with CloudWatch Logs.

With this integration you can centralize and index any server logs you want, helping you analyze the state of both your application and infrastructure. It can serve as a replacement for a traditional ELK stack, which can be troublesome to get up and running properly. It’s a great boon to your instrumentation and I’ll take you through how to quickly get up and running.

Preparation

In this example the application will be based on Ruby on Rails running on an Ubuntu server, but any language and framework will do; all we need is the log file we want to analyze. The following steps assume an intermediate familiarity with AWS. If you want to follow along verbatim you will need an EC2 instance with Rails already installed; otherwise skip to the next section.

Log into your EC2 instance, clone this very simple Ruby on Rails application I made for this article, and start the rails server:

$ git clone git@github.com:krigar/aws-elasticsearch.git
$ cd aws-elasticsearch
$ bundle install
$ bundle exec rake assets:precompile
$ RAILS_ENV=production rails s -d -b 0.0.0.0

Direct your browser to the public IP of your EC2 instance on port 3000 and you should be greeted by a headline with three links at the top. Do not run Rails applications like this in an actual production environment, by the way.
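If you’d rather verify from the command line, a quick curl against the instance should return the page’s HTML (substitute your instance’s public IP):

$ curl -s http://<PUBLIC-IP>:3000 | head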

CloudWatch Logs

I’ll be using the AWS Console for all of the AWS configuration to make it easier to follow, but all of this can also be done through the API.

IAM

To transfer the server logs to the CloudWatch service we need to create an IAM user with the appropriate permissions.

  1. Go to your AWS account’s IAM section.
  2. Create a user called cloudwatch-logs and save the access keys.
  3. Attach the following custom inline policy to the user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "arn:aws:logs:*:*:*"
      ]
    }
  ]
}
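If you prefer scripting this over clicking through the console, the equivalent AWS CLI calls would look roughly like this. A sketch assuming the policy above is saved as cloudwatch-logs-policy.json; the file and policy names are just examples:

# Create the user and generate its access keys (save the output)
$ aws iam create-user --user-name cloudwatch-logs
$ aws iam create-access-key --user-name cloudwatch-logs

# Attach the inline policy from above
$ aws iam put-user-policy --user-name cloudwatch-logs \
    --policy-name cloudwatch-logs-access \
    --policy-document file://cloudwatch-logs-policy.json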

Agent

Log into your EC2 instance where the Ruby on Rails application is running and execute the following commands, specifying your chosen region:

$ sudo apt-get update
$ wget https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py
$ sudo python ./awslogs-agent-setup.py --region <YOUR-REGION>

This will start the interactive setup of the agent. Below is a screenshot of my configuration, with the access keys edited out. The timestamp format used here is specific to Ruby on Rails, so if you’re uploading a different log, be mindful to change it accordingly.

[Screenshot: interactive agent setup]
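The setup writes your answers to a config file (on my install, /var/awslogs/etc/awslogs.conf). For reference, a minimal sketch of what the resulting file can look like; the paths, log group name, and timestamp format here are assumptions based on my Rails setup and will differ for other applications:

[general]
state_file = /var/awslogs/state/agent-state

[/home/ubuntu/aws-elasticsearch/log/production.log]
file = /home/ubuntu/aws-elasticsearch/log/production.log
log_group_name = rails-production
log_stream_name = {instance_id}
datetime_format = %Y-%m-%dT%H:%M:%S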

Proceed to the CloudWatch service in your AWS console, click Logs on the left-hand side and verify that your log appears.
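You can do the same check from the command line; these calls assume the log group name rails-production from the sketch above:

$ aws logs describe-log-groups --region <YOUR-REGION>
$ aws logs describe-log-streams --log-group-name rails-production --region <YOUR-REGION>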

Note: This process is specific to configuring CloudWatch Logs on existing EC2 instances; for new instances, follow AWS’ documentation.

Elasticsearch

To create the Elasticsearch domain which will hold our logs, go to the Elasticsearch service in the AWS console and start the creation wizard.

Select t2.micro as the instance type and EBS as the storage, and keep everything else at its defaults. You want to have a strict access policy in place for the domain, since anyone with access can tamper with the data, so choose Allow access to the domain from specific IP(s) and provide your public IP so you can view the Kibana interface.
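For reference, the same domain can be created from the CLI. A rough sketch, assuming a domain named rails-logs; the region, account ID, and IP are placeholders you’d fill in yourself:

$ aws es create-elasticsearch-domain --domain-name rails-logs \
    --elasticsearch-cluster-config InstanceType=t2.micro.elasticsearch,InstanceCount=1 \
    --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=10 \
    --access-policies '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "es:*",
        "Resource": "arn:aws:es:<YOUR-REGION>:<ACCOUNT-ID>:domain/rails-logs/*",
        "Condition": {"IpAddress": {"aws:SourceIp": "<YOUR-PUBLIC-IP>"}}
      }]
    }'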

Integration

Once the Elasticsearch domain is in the active state we’re ready to start streaming the log to it.

Locate your log in the CloudWatch service, mark it, and click Actions; an option saying Start Streaming to Amazon Elasticsearch Service should be among the choices.

Select your newly created Elasticsearch cluster and then choose to create a new IAM role for the Lambda execution, accepting the defaults it presents you with. The next step is to determine the log format; this is a bit tricky for Ruby on Rails logs, since exception stack traces break the format, but this is the best filter pattern I’ve found:

[debug_code, datetime, debug_level, separator_dash, status]

Different logs will naturally require different patterns. Once you’re happy with what it returns, review the settings, and start streaming.
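Under the hood the wizard creates a Lambda function that forwards matching events to the domain and subscribes it to your log group. If you ever need to recreate that wiring yourself, the subscription side would look roughly like this; the group, filter, and function names are placeholders for whatever the wizard created in your account:

# The Lambda function must already permit invocation by CloudWatch Logs
$ aws logs put-subscription-filter --log-group-name rails-production \
    --filter-name rails-to-elasticsearch \
    --filter-pattern '[debug_code, datetime, debug_level, separator_dash, status]' \
    --destination-arn arn:aws:lambda:<YOUR-REGION>:<ACCOUNT-ID>:function:<LAMBDA-NAME>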

Kibana

Kibana is an interface for Elasticsearch to help analyze and visualize its data. It makes it easy to run queries and set up dashboards for your logs.

Go to the Elasticsearch service, click on your domain, and then proceed to its Kibana endpoint. Once it’s done with its initial setup, set the default index pattern to cwl-* and the time-field name to @timestamp. To verify that your log is streaming correctly to the Elasticsearch cluster, click Discover at the top of the Kibana interface; you should be presented with a simple query feature along with a list of the most recent log entries.

[Screenshot: Kibana Discover]
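The cwl-* pattern matches the indices the streaming Lambda writes into. Since your public IP is whitelisted in the domain’s access policy, you can also confirm the indices exist with a plain HTTP request against the domain endpoint (substitute your own endpoint):

$ curl -s 'https://<YOUR-DOMAIN-ENDPOINT>/_cat/indices?v'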

If you set up the Ruby on Rails application, you can create more entries by browsing its pages; they should appear in Kibana within a few seconds.
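The fields extracted by the filter pattern become searchable fields in Kibana, so you can narrow things down with Lucene query syntax in the Discover search bar. A couple of example queries, assuming the field names from the pattern above come through the streaming intact:

debug_level:ERROR
NOT debug_level:INFO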

You’re now ready to harness the power of logs streaming directly into an Elasticsearch cluster, visualized through Kibana. I highly recommend reading the Kibana user guide to get the most out of it.
