Analyzing logs with ELK: the easy way

Serhii Sokoliuk
DevOops World … and the Universe
8 min read · Apr 26, 2016

Intro

Initially, the idea to visualize Apache Traffic Server logs came from this post. Specifically, I wanted to visualize the hit ratio, OS families, and browsers from the Apache Traffic Server access log. For this purpose I chose the ELK stack, which is a very useful toolset for aggregating, storing, and visualizing data from logs. For people who are new to the world of log management, the acronym ELK covers a set of tools: Elasticsearch, a real-time search server based on Lucene; Logstash, a tool for managing events and logs; and Kibana, a data visualization platform.

Components

I used Docker to deliver all the components. I wanted a stack that would be easy to install and configure, so I chose containerization with Docker as the basic format. Links to sources:

At a high level, it looks like this:

                  +--------------------------------------------------+
                  |                     ELK stack                    |
                  |                                                  |
+--------------+  |  +----------+    +---------------+   +--------+  |
| Docker log   +----->          +---->               +--->        |  |
+--------------+  |  | Logstash |    | Elasticsearch |   | Kibana |  |
                  |  |          |    |               <---+        |  |
                  |  +----------+    +---------------+   +--------+  |
                  |                                                  |
                  +--------------------------------------------------+

I send Docker logs with the gelf log driver (GELF is the Graylog Extended Log Format, which Logstash can consume) to a Logstash input on port 12201; Logstash processes them and puts them into Elasticsearch, and then we can access them via Kibana.

On the Docker side, I only need to add two options to the docker run command:

... --log-driver=gelf --log-opt gelf-address=udp://<host:port> ...

where <host:port> is the host and port where Logstash will be listening.
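For example, shipping the logs of any container to the stack could look like this (the nginx image and the address are placeholders, just for illustration):

docker run -d --log-driver=gelf --log-opt gelf-address=udp://logstash.example.com:12201 nginx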

L - Logstash

For Logstash I needed to configure a gelf input, an Elasticsearch output, and a filter.

In general, a Logstash configuration file has three sections:

input { ... }
filter { ... }
output { ... }
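Logstash picks such a file up at startup; inside the container that typically boils down to something like the following (the path here is just an example, not necessarily what the image uses):

logstash -f /etc/logstash/conf.d/gelf.conf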

In my case I needed to add a gelf input, so the input section looks like this:

input {
  gelf {
    type => "gelf"
    port => 12201
  }
}

Next, I needed to process the events. I used the filter section to parse the custom access.log that originally comes from Apache Traffic Server:

filter {
  if [type] == "gelf" {
    grok {
      match => { "short_message" => "%{COMBINEDAPACHELOG:log} %{NOTSPACE:content} %{NOTSPACE:cachecode}" }
    }
    grok {
      match => { "agent" => "%{NOTSPACE:browser} (?<os_family>\([a-zA-Z0-9:;-_.].+\)) %{NOTSPACE:web_kit} %{NOTSPACE:browser_ver}" }
    }
    mutate {
      gsub => [
        "browser", "\"", "",
        "browser_ver", "\"", "",
        "os_family", "[()]", ""
      ]
    }
  }
}

I used the built-in grok filter plugin to split the message into the default COMBINEDAPACHELOG fields, then used a second grok to split the "agent" field from the previous step into "browser", "os_family", "web_kit", and "browser_ver" fields. I also used mutate to remove unnecessary characters, such as parentheses and quotes.

The next section is the output. I used the built-in Elasticsearch output plugin:

output {
  if [type] == "gelf" {
    elasticsearch {
      hosts => ["<elastic host>:<elastic port>"]
      index => "gelf-%{+YYYY.MM.dd}"
    }
  }
}

The "hosts" option is an array of Elasticsearch host:port pairs, and the "index" option defines which index to write to, so we know where to find our logs later.
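A quick way to check that events are actually reaching Elasticsearch is to query the index over HTTP (a standard Elasticsearch API call; the host and port are the same placeholders as above):

curl 'http://<elastic host>:<elastic port>/gelf-*/_count?pretty'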

E - Elasticsearch

Elasticsearch can be run in a single container, which is fine for testing. But I didn't know what the load would be, and to be able to scale easily I ran three nodes: one as a client node, one as the master node, and one as a data node. To do this I had to run each of them with different environment variables and forward the appropriate ports. For the master node:

docker run -d -e ELASTIC_NODE_MASTER=true -p 9300:9300 quay.io/7insyde/elastic

Here the ELASTIC_NODE_MASTER environment variable tells Elasticsearch that this instance is the master node, which controls the cluster and decides which shards to allocate to which nodes; port 9300 is forwarded for the transport module.

For the data node it will be:

docker run -d -v /data/elastic:/data/elastic -e ELASTIC_NODE_DATA=true -p 9300:9300 quay.io/7insyde/elastic

Here another environment variable, ELASTIC_NODE_DATA, marks the data node, which holds the data and performs data-related operations such as CRUD, search, and aggregations. So as not to lose the data, I mount the Elasticsearch storage to /data/elastic on the host and again forward port 9300 for the transport module.

For the client node:

docker run -d -e ELASTIC_HTTP_ENABLE=true -p 9200:9200 -p 9300:9300 quay.io/7insyde/elastic

Here the ELASTIC_HTTP_ENABLE environment variable marks the client node, which forwards cluster-level requests to the master node and data-related requests to the data nodes. Besides the transport port 9300, we also need to forward the HTTP port 9200 for clients; this is the host and port we used in the Logstash output configuration.

Now a question may arise: how will the nodes find each other? This can be done in several ways, which are described in the Elasticsearch documentation. I didn't change the discovery module settings, so the default Zen Discovery is used; as the documentation says, it provides unicast discovery, and the nodes find each other via the Ping sub-module. With the basic config it didn't work, so I added the name of the interface the Elasticsearch node should use:

discovery.zen.ping.multicast.address: _eth0:ipv4_

I also set the default analyzer type:

index.analysis.analyzer.default.type: keyword

If no analyzer is specified explicitly, the standard analyzer is used by default. And if we don't override the default type, the string fields that Elasticsearch creates will be analyzed, which means every string gets split according to the analyzer's rules and Kibana will not display it the way we need.

This is enough to have a running Elasticsearch cluster that we'll use for our data.
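To confirm that all three nodes actually joined the cluster, the _cat API on the client node's HTTP port gives a quick overview (the host is a placeholder):

curl 'http://<elastic host>:9200/_cat/nodes?v'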

K - Kibana

Kibana is a perfect match for our stack because it was designed for Elasticsearch. To run Kibana in Docker, I need to specify "elasticsearch.url" in kibana.yml. To avoid hardcoding it, I decided to use sed to replace the value in kibana.yml using the HOST and PORT environment variables of Elasticsearch, and to forward port 5601 for the web UI:

docker run -d -e HOST=<elastic_host> -e PORT=<elastic_port> --name kibana -p 5601:5601 quay.io/7insyde/kibana
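The substitution itself is simple; a minimal sketch of such an entrypoint script might look like this (the script and the kibana.yml path are my assumptions about the image, not taken from it):

#!/bin/sh
# Rewrite elasticsearch.url in kibana.yml from the HOST and PORT environment variables (assumed path)
sed -i "s|^.*elasticsearch.url:.*$|elasticsearch.url: \"http://${HOST}:${PORT}\"|" /opt/kibana/config/kibana.yml
# Then start Kibana as usual
exec /opt/kibana/bin/kibana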

When I go to localhost:5601 in the browser, I see the index configuration page: the index name will be gelf-*, as we configured in the Logstash output, and the time-field name is @timestamp.

Now we can see the messages coming from Traffic Server:

To visualize the cache codes, which tell us which requests Traffic Server served from cache (HIT) and which were missed and forwarded to the origin (MISS), I created a visualization:

To visualize OS families and browsers, I created another visualization:

Deployment

While testing and building the containers I managed them manually, mostly launching them one by one and connecting them to each other. Once the whole ELK stack was up and running, I needed a tool to manage it easily. Docker Compose was the tool that came to mind: it uses the Docker API and is simple to use while still being quite functional. I created a docker-compose.yml file, and the launch process suddenly became fast and easy. Just run:

docker-compose up -d

from the directory with the Compose file to launch the stack in detached mode, and the corresponding docker-compose down command to shut it down.
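For reference, a rough sketch of what such a compose file might look like for this stack, in version 1 syntax since it is later reused with ECS (the Elasticsearch and Kibana image names come from the commands above; the Logstash image name, ports, links, and environment values are my assumptions):

logstash:
  image: quay.io/7insyde/logstash
  ports:
    - "12201:12201/udp"
  links:
    - elasticsearch
elasticsearch:
  image: quay.io/7insyde/elastic
  environment:
    - ELASTIC_HTTP_ENABLE=true
  ports:
    - "9200:9200"
    - "9300:9300"
kibana:
  image: quay.io/7insyde/kibana
  environment:
    - HOST=elasticsearch
    - PORT=9200
  ports:
    - "5601:5601"
  links:
    - elasticsearch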

But running the stack locally won't help much with gathering logs from services that don't run locally either. I had to move it to the cloud, and Amazon ECS was a good solution. I could have simply created an EC2 instance and run docker-compose there, but that wouldn't be very flexible, since it would just duplicate my local setup on an EC2 instance. That's why I decided to set it up on Amazon ECS.

Compose + AWS

Launching process didn’t go as smoothly as I expected, the problem was that ECS used its own API, and one couldn’t simply export docker-compose.yml file over the AWS console. It could be done only by using AWS-CLI. The ecs-cli syntax is quite simple to create a task definition you need:

ecs-cli compose --verbose --file docker-compose.yml --project-name 7insyde create

where "--project-name" is your task definition's project name and "--file" is the path to your docker-compose.yml file. But when I tried it, I got AWS errors saying that my .yml file was invalid, even though it was valid. I found the solution to this problem on GitHub: adding two spaces in front of the "-" list markers fixed the issue (see also: ecs parsing issues). That is why I created an ecs-compose.yml file, which is the same as docker-compose.yml but adjusted for ECS. Now the following line does the trick, and the task definitions get created on ECS:

ecs-cli compose --verbose --file ecs-compose.yml --project-name 7insyde create

ecs-cli compose works only with Docker Compose file version 1, so I created a docker-compose-v2.yml for local use.
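To make the parsing problem concrete, here is a hypothetical fragment showing the kind of indentation change that satisfied the ECS parser:

# rejected by ecs-cli compose
kibana:
  ports:
  - "5601:5601"

# accepted after adding two spaces before the "-"
kibana:
  ports:
    - "5601:5601"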

Gelf log-driver

When I tried to launch my ELK stack, I faced a problem: the Logstash container failed to run because the gelf log driver was reported as not supported. In fact it was supported, but you have to enable it on your ECS instance. Just SSH to your ECS instance and make the following changes to the ecs.config file:

ECS_CLUSTER=seven-insyde
ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","gelf","fluentd"]

You should specify all the logging drivers you need as an array; since I also use the json-file and Fluentd drivers, I added them to ecs.config as well. After that, just restart the ECS agent and it should be fine. Since the task definitions were already created, launching the stack again was really simple. I created an ECS service to start ELK automatically, but you may want to run it as a single task.
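On the Amazon ECS-optimized AMI the agent runs as a system service, so restarting it after the config change looks roughly like this (run on the instance itself):

sudo stop ecs
sudo start ecs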

Summary

I have to say that the ELK stack is a very flexible and scalable toolset. Each of the components provides a great set of options that can be extended for production use, and the ELK developers provide a lot of examples in the elastic GitHub organization.
With tools such as Docker and Docker Compose, running and deploying the stack becomes easy and very convenient. Of course, I don't think this is an ideal solution; it can be improved or fully reworked.
