Structured Logging in Rails — Part 1

Sumeet Gujrati
Upstart Tech
Published in
5 min readAug 18, 2020

At Upstart, we’ve been investing in common tools to help enable developers building new services to adopt common patterns/practices. In Part 1 of the blog, we will discuss Rails logger, its shortcomings, and how to update it to use structured logging. In Part 2, we will discuss the same for Sidekiq logger.

The Problem

When you collect logs for processing by a log analysis tool, logging in a known format, such as JSON or XML makes processing easier and more efficient. When you log in structured format, log analysis tools use structure appropriate parsers which are based on state machines. In contrast, processing plain text logs requires the use of regular expressions, which are computationally expensive and brittle. Consider this example of an unstructured log:

[2018-05-13T17:51:08.772542] [INFO] [127.0.0.1] [2892] Customer registration error - SSN already exists.

If, say, you wanted to search for customer registration errors, you’d use the regular expression .*Customer registration error — (.*) to search the logs. The same log, if logged in JSON format may look like as follows:

{    
"timestamp": "2018-05-13T17:51:08.772542",
"source": "127.0.0.1",
"msg": "SSN already exists.",
"level": "INFO",
"user_id": 2892,
"error_type": "customer_registration"
}

Finding the customer registration errors is a much simpler and durable task of selecting all the JSON logs where the field error is equal to customer_registration. The more structure you have in your log messages, the more your log analysis tool can take advantage of that structure and perform its tasks with greater speed and accuracy while using fewer resources.

Default Rails Logger

At Upstart, we use SumoLogic® as our log analysis tool. Although it is fully capable of parsing unstructured logs, the preferred way is to log in a structured format because of the advantages outlined above. Rails default logging has various issues including but not limited to:

  1. It logs messages in an unstructured format.
  2. It logs HTTP requests, but not responses that may aid in debugging efforts. Moreover, its request logs are spread over multiple log messages, such as one message at the beginning of the request, one message when the request is being processed, and one message when the request is complete.
  3. It does tag logs with information such as request id, user id, and source IP address. However, it does so in an unstructured format, and tags are not labeled.

In order to use structured (request/response and tagged) logging in Rails applications, we implemented two Rack middleware¹ packages, and a structured logger that logs in JSON format. We chose JSON as that is the most commonly used data-interchange format. All three of these are required to achieve the desired outcome. We utilized two existing Gems, Ougai² and Logstop³ to implement the structured logger. Ougai provides a logger capable of logging in JSON format and Logstop provides functionality to redact sensitive fields from JSON log messages. In the following sections, we discuss the implementation and usage of the two Rack middleware packages and the logger.

Structured Logging Implementation

As mentioned earlier, we implemented two Rack middleware packages:

  1. Request and response middleware — used to log request and response in a single JSON formatted log message, and
  2. Tagged logger middleware — adds tags to the request’s Thread.current hash so that the structured logger can include them with each log message.

We also implemented functionality to redact sensitive fields such as password, ssn etc. from the log messages. The source code can be found here for you to download and modify as needed. All examples in the following sections assume that the source code is saved in structured_logger sub-directory of application’s lib directory.

Request/Response Middleware

This middleware is responsible for logging the complete request and response. You can use this middleware in config/application.rb as follows:

require Rails.root.join("lib", "structured_logger", "structured_logger")config.middleware.use StructuredLogger::RequestLogger,          
custom_fields: %w[custom_field1 custom_field2],
app_name: "<< app name >>"

custom_fields is an array of fields that are redacted from the log messages. By default, request/response logger redacts the following fields from log messages:

password name first_name last_name dob date_of_birth ssn email address  street_address street phone

A sample request/response log looks like:

{   
"timestamp": "2020-04-22T19:25:14.712080Z",
"app_name": "<< app name >>",
"level": "INFO",
"context": "request_response",
"status": 200,
"url": "/company_query?search=Google%2C%20Inc.",
"query_string": "search=Google%2C%20Inc.",
"user_id": 1122,
"duration": 0.174971,
"ip": "0.0.0.0",
"request_method": "GET",
"x_request_id": "123fs7fc-f733-48ab-870c-ada26dh564hs",
"request": {
"email": "[FILTERED]",
"password": "[FILTERED]"
},
"response": {
....
}
}

If you use request/response middleware, you should disable Rails default logging on per-action basis using Silencer⁴ middleware in config/application.rb as follows:

require "silencer/logger"  Rails.application.configure do   
config.middleware.swap(
Rails::Rack::Logger,
Silencer::Logger,
config.log_tags,
silence: [%r{^/<< actions that need to be silenced >>/}]
)
end

Tagged Logger Middleware

This middleware inserts additional information in the request’s Thread.current hash, so that they can be logged by the structured logger with each log line. This middleware can be used in config/application.rb as follows:

require Rails.root.join("lib", "structured_logger", "structured_logger")  config.middleware.use StructuredLogger::TaggedLogger,   
custom_tags: %w[custom_tag1 custom_tag2]

custom_tags is an array of tags that the structured logger logs with each log message. By default, it logs x_request_id, which is the value of action_dispatch.request_id and source (file name and line number where log message is logged). If you want to log additional tags, i.e., request headers, with the log messages, then you can pass an array as the value of custom_tags.

Structured Logger

We use an Ouagi logger instance to log the messages in JSON format. It also adds tags specified in tagged logger middleware to the log messages. The logger can be initialized with custom_fields and custom_tags, which have the same meaning as specified in the request/response middleware and tagged logger middleware respectively. For the structured logger to tag logs properly, the value of custom_tags passed to it should be the same as the one used with tagged logger middleware.

Usage

You can use both Rack middleware packages and the structured logger in config/application.rb as follows:

require Rails.root.join("lib", "structured_logger", "structured_logger")  app_name = "my_app" 
custom_fields = %w[field1 field2]
custom_tags = %w[tag1 tag2]
# tagged logger middleware
config.middleware.use StructuredLogger::TaggedLogger,
custom_tags: custom_tags
# tagged logger instance
logger = StructuredLogger::Utils.tagged_logger(
app_name: app_name,
custom_fields: custom_fields,
custom_tags: custom_tags
)
logger.level = Logger::INFO
config.logger = logger
# request/response middleware
config.middleware.use StructuredLogger::RequestLogger,
custom_fields: custom_fields, app_name: app_name

The Rails default tagged logger should be disabled, otherwise you will notice that some of your JSON formatted log messages are converted to string before being logged. The only way we found to disable the default Rails logger is to override ActiveSupport::TaggedLogging::Formatter.module_eval.call method using an initializer, like this:

require "active_support/core_ext/module/delegation" 
require "active_support/core_ext/object/blank"
require "logger"
require "active_support/logger"
ActiveSupport::TaggedLogging::Formatter.module_eval do
# The default implementation of call method is
#
# def call(severity, timestamp, progname, msg)
# super(severity, timestamp, progname, "#{tags_text}#{msg}")
# end
#
# Note that, it attaches tags and converts msg to string.
def call(severity, timestamp, progname, msg)
super(severity, timestamp, progname, msg)
end
end

It is recommended that when a service calls another service, the caller service should pass X-Request-Id header to the other service, so that you can trace requests end to end.

Rails applications make heavy use of Active Job, a framework for declaring jobs and making them run on a variety of queuing backends. Active Job can be configured to work with Sidekiq, Resque, Sneakers and many other backends. In the second part of the blog, we will discuss how to structure Sidekiq logs.

References

  1. Rack middleware — https://guides.rubyonrails.org/rails_on_rack.html
  2. Ougai — https://github.com/tilfin/ougai
  3. Logstop — https://github.com/ankane/logstop
  4. Silencer — https://github.com/stve/silencer

--

--