Centralized Logging in Microservices using AWS CloudWatch + Elasticsearch

Matias De Santi
Wolox

--

Tracking a request in a microservices-oriented architecture can be a headache. Let's say you have several microservices calling one another, and one of your clients' requests returns an Internal Server Error.

How can you track the request down to the exact location where the error occurred? If you have only one server per microservice, you might be able to SSH into each server and tail the logs. However, if you happen to have two or more servers per service, this is not something we recommend.

Additionally, how can you match incoming requests to microservice B with the corresponding request in service A? (A must if you are tracking down a bug.)

We are here to help! We came across this same problem in a project we are currently working on and decided it was time to build a centralized logging system that could gather all our application logs into a single service. At the time of designing this logging system, we had these premises in mind:

  1. It should work in real time
  2. It should be easy to search
  3. It should help our devs correlate requests between microservices

The Background Info

The project is hosted on AWS and services are deployed on Elastic Beanstalk. This means that at any given time we can have more than one server per service, depending on the application's traffic.

Our hosted applications are either written in Ruby on Rails or Express.js and each one outputs its logs into a log file. This file is usually located in /var/log/puma/puma.log or /var/log/nodejs/nodejs.log depending on the application server or the language in which the app was developed.

The logic behind this

Having used Elasticsearch + Kibana to search across logs before, we decided to keep using these tools when designing the architecture of this logging system.

Each microservice has its own servers, and each server writes its logs to a log file. These log files are streamed to a centralized log receiver, where they are processed and then forwarded to Elasticsearch. Once there, they can be queried through Kibana.

Matching requests between microservices

In order to match requests between microservices, we decided to assign a request id to every request a microservice receives, unless the x-request-id header is already present. Hence, when Microservice A makes a request to Microservice B, it forwards the requestId it received (or generated). This is something that is already done by some AWS services such as API Gateway and Lambda.
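As a rough sketch of that idea in Express (the uuid and axios packages, the endpoint and the internal URL below are illustrative assumptions, not our exact implementation):

```javascript
const express = require('express');
const { v4: uuidv4 } = require('uuid');   // assumed dependency for generating ids
const axios = require('axios');           // assumed HTTP client for the outgoing call

const app = express();

// Reuse the incoming x-request-id if present, otherwise generate one.
app.use((req, res, next) => {
  req.requestId = req.headers['x-request-id'] || uuidv4();
  res.setHeader('x-request-id', req.requestId);
  next();
});

// When Microservice A calls Microservice B, it forwards the same id
// (the endpoint and URL are hypothetical).
app.post('/orders', async (req, res) => {
  const response = await axios.post(
    'http://microservice-b.internal/payments',
    { orderId: 123 },
    { headers: { 'x-request-id': req.requestId } }
  );
  res.status(response.status).end();
});

app.listen(3000);
```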

The Step-by-Step Breakdown

Application Logs

Our first stop towards a centralized logging system is formatting our application's logs. It is good practice to keep your applications' logs in a common format. This makes it simple for anyone to move across applications and gives your logs a structure that is easy to understand. You can choose whatever format you like, but we suggest JSON.

Rails logs in JSON format

To get your Rails application's logs in JSON format, all you need to do is add logstash-logger to your Gemfile. Then, in production.rb, you can configure the fields each log entry will carry.
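A minimal sketch of what that configuration could look like is shown below; the log path and the custom field names are illustrative rather than our exact setup, and the per-request values are read back from Request Store (more on that next):

```ruby
# config/environments/production.rb (sketch)
# Write logs as JSON logstash events; point the path at wherever your
# application server's log file lives (illustrative here).
config.logger = LogStashLogger.new(type: :file, path: 'log/production.log')

# Add per-request fields to every event. The field names are illustrative
# and must match whatever you store in RequestStore.
LogStashLogger.configure do |logstash|
  logstash.customize_event do |event|
    event['request_id'] = RequestStore.store[:request_id]
    event['subject_id'] = RequestStore.store[:subject_id]
  end
end
```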

A quick note here: we are using Request Store to store the subject id (the user who made the request) and the request id as global per-request variables.
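As a sketch, populating those values can be as simple as a before_action in ApplicationController (the store keys and the current_user helper are assumptions on our side):

```ruby
# app/controllers/application_controller.rb (sketch)
class ApplicationController < ActionController::Base
  before_action :store_request_metadata

  private

  def store_request_metadata
    # request.request_id honors an incoming X-Request-Id header, or generates one.
    RequestStore.store[:request_id] = request.request_id
    # Assumes an authentication helper such as current_user is available.
    RequestStore.store[:subject_id] = current_user&.id
  end
end
```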

Express logs in JSON format

For Express, we decided to use Winston to produce the JSON logs.
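A minimal sketch of a Winston logger that writes JSON lines to a file could look like this (the path and the log level are assumptions, adjust them to your app):

```javascript
// logger.js (sketch)
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [
    // Write to the file the CloudWatch Logs agent will tail (illustrative path).
    new winston.transports.File({ filename: '/var/log/nodejs/nodejs.log' })
  ]
});

module.exports = logger;
```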

However, building a logger service for Express presented a further challenge. There is no notion of per-request variables that could be used to keep track of the current requestId. That meant we would have had to pass the requestId (or the logger itself) through every method call just to use it when logging. After further research, we came across Continuation Local Storage, a project that provides Node.js with an execution context.
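Roughly, the idea looks like this: a namespace holds the requestId for the lifetime of each request, so the logger can read it back anywhere in the call chain (the namespace and key names below are illustrative):

```javascript
const cls = require('continuation-local-storage');
const { v4: uuidv4 } = require('uuid');   // assumed dependency for generating ids

const namespace = cls.createNamespace('request');

// Express middleware: bind the request/response emitters to the namespace
// and store the request id for everything that runs within this request.
function requestContext(req, res, next) {
  namespace.bindEmitter(req);
  namespace.bindEmitter(res);
  namespace.run(() => {
    namespace.set('requestId', req.headers['x-request-id'] || uuidv4());
    next();
  });
}

// Anywhere down the call chain, e.g. inside the logger service:
function currentRequestId() {
  return namespace.get('requestId');
}

module.exports = { requestContext, currentRequestId };
```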

Centralizing application logs

Now that we have our logs in JSON format, it is time to stream them to a centralized service. We chose CloudWatch Logs because it has a pretty straightforward way of forwarding logs to Elasticsearch.

In order to forward our application logs to CloudWatch Logs, we need to include a configuration file in our .ebextensions folder.
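A sketch of such a file is shown below. It assumes the Elastic Beanstalk instances run the awslogs agent and pick up extra configuration from /etc/awslogs/config/; the log group name and the config file name are placeholders:

```yaml
# .ebextensions/application-logs.config (sketch)
packages:
  yum:
    awslogs: []

files:
  "/etc/awslogs/config/application.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      [/var/log/puma/puma.log]
      file = /var/log/puma/puma.log
      log_group_name = my-app-production
      log_stream_name = {instance_id}

commands:
  01_restart_awslogs:
    command: service awslogs restart
```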

This will start streaming our application logs located in /var/log/puma/puma.log to CloudWatch Logs.

Streaming logs to Elasticsearch

If you haven’t already created an Elasticsearch Cluster, go ahead and do so. Once that is complete, follow these steps:

  1. Visit the CloudWatch Logs dashboard and find your application's log group.
  2. Select the entry and, under the Actions dropdown, choose ‘Stream to Amazon Elasticsearch Service’.
  3. Choose your Elasticsearch cluster and click Next. AWS will automatically create a Lambda function that will be in charge of forwarding the logs.
  4. Choose JSON as the log format.
  5. That’s it!

Now your logs are being streamed to Elasticsearch. CloudWatch will forward the logs to an index called cwl-YYYY.MM.DD.

Viewing your logs

Open Kibana and configure your index in the Settings view. As the index name or pattern, enter cwl-*. This tells Kibana to load all the indices that match that pattern.

If you go to the Discover view, you will see the logs. You can select specific fields to get a cleaner, more ordered view.

For example, a typical entry carries the timestamp, log level, message, request id and subject id side by side.
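A made-up entry with those fields might look like this (the values are illustrative, not taken from our system):

```json
{
  "@timestamp": "2018-05-14T18:32:10.123Z",
  "level": "info",
  "message": "POST /api/v1/orders 201 Created in 54ms",
  "request_id": "5c9f7e2a-1d3b-4c8e-9f0a-7b6d2e4a1c3f",
  "subject_id": 42
}
```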

Conclusions

After experimenting and researching various ways to centralize logging, we found this method easy to set up and get working, thanks to AWS's built-in integration between its services. And, of course, it helps trace errors and bugs across different APIs, which makes tracking a request in a microservices-oriented architecture a whole lot simpler. Give it a go and see if it works for your project's needs!

Feel free to leave any comments, suggestions, or questions in the section below.

--

Matias De Santi
Wolox

Software Engineer and Infrastructure & Cloud Leader at Wolox. I'm passionate about applying new technologies to the projects I work on to get the best results.