Application Logging Using Filebeat and Elasticsearch

Faisal Pathan
NoBroker Engineering
4 min read · Jul 9, 2020

NoBroker's servers cater to no fewer than 20 million HTTP requests on a normal day. Behind the scenes we have 18 micro-services all working in tandem to serve those requests. Together they generate around 25–30 GB of application and access logs daily.

Some of the pain points we were facing:

  • Log files get rotated on an hourly or daily basis, which makes it very difficult to find a particular log entry among a host of log files. Not to mention how costly that is.
  • No useful information can be extracted from the raw data, e.g. comparing the number of error logs between last week and the current week, or finding the error logs generated by a given class on a given day.
  • Authorisation issues, since we didn't want everyone on the team to have access to our production VMs.

We at NoBroker are in love with Elasticsearch and its ability to integrate sub-systems like APM and Filebeat without any impact on its efficiency and speed. This gave birth to TopGun, our one-stop-shop solution to all the problems discussed above, and more.

TopGun Architecture:
  • Filebeat sniffs the log files and pushes them to the Elastic cluster on a log-by-log basis.
  • Elastic then converts each log into a document using a pipeline and pushes it into an Elastic index.
  • Kibana helps in visualising and querying the logs.

Filebeat Configuration:

You can download Filebeat from its official site. You can have it up and running after providing details about your log files and Elastic cluster in the filebeat.yml file.

filebeat.inputs:
- type: log
  enabled: true
  paths: #1
    - /var/log/nobroker/application.log
  fields: #2
    type: "admin-logs"
  pipeline: "admin-logs" #3
  multiline.pattern: '\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},' #4
  multiline.negate: true
  multiline.match: after

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  # Note: overriding the default index may also require setting
  # setup.template.name and setup.template.pattern.
  index: "nb_logs-%{[fields.type]}-%{+yyyy.MM.dd}"
  username: "elastic"
  password: "changeme"
  1. paths contains one or more paths to your log files.
  2. fields holds custom fields you can send with each log to Elastic, which will then be pushed into your documents.
  3. pipeline holds the name of the Elastic ingest pipeline which will transform your single log line into a document.
  4. multiline.pattern is a regex used by Filebeat to detect where a new log begins. This is important since a single error log will usually be spread across more than one line; see the sketch after this list.
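To see why this matters: with multiline.negate: true and multiline.match: after, any line that does not start with a timestamp is appended to the event before it, so a stack trace stays glued to its error line. A sketch with hypothetical log lines:

2020-06-01 22:17:40,903 ERROR in.nobroker.TestController : saving transaction error   <- matches the pattern, starts a new event
java.lang.UnsupportedOperationException: Attempted to serialize java.lang.Class       <- no timestamp, appended to the event above
	at com.google.gson.internal.bind.TypeAdapters$1.write(TypeAdapters.java:69)          <- no timestamp, appended to the event above

Once filebeat.yml is ready, you can start Filebeat from its installation directory, typically with ./filebeat -e -c filebeat.yml.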

Elasticsearch Pipeline:

A pipeline provides processors which can be used to modify the logs while creating a document. In our case we used grok expressions in a processor to extract and label the data in our logs.
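As a rough illustration, a pipeline like the admin-logs one referenced in the Filebeat config could be registered through the ingest API as below; the grok pattern and field names here are placeholders rather than our production pattern:

PUT _ingest/pipeline/admin-logs
{
  "description": "Turn raw application log lines into structured documents",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{JAVACLASS:class} %{GREEDYDATA:log_message}"
        ]
      }
    }
  ]
}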

Grokdebug is an amazing tool to create and test out your grok patterns. It has a collection of frequently used grok patterns, augmented by powerful debugging capabilities.

Below is a sample of a typical Java-based error log and its possible grok pattern.

Log sample
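A raw log of this shape, with values matching the parsed output below, might look like this (a reconstruction; the exact production format may differ):

2020-06-01 22:17:40,903 [no-c2c2c2-h3h3] [cb32-k98-0-b-aa6f0d6fb201] [qtp17203-77777] ERROR in.nobroker.TestController [[TestCont]] : saving transaction error
java.lang.UnsupportedOperationException: Attempted to serialize java.lang.Class: org.hibernate.proxy.HibernateProxy. Forgot to register a type adapter?
	at com.google.gson.internal.bind.TypeAdapters$1.write(TypeAdapters.java:69) ~[gson-2.4.jar:na]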

Here is the log statement after applying the grok pattern to it.

{
  "timestamp": [
    [
      "2020-06-01 22:17:40,903"
    ]
  ],
  "POD": [
    [
      "no-c2c2c2-h3h3"
    ]
  ],
  "USERNAME": [
    [
      "no-c2c2c2-h3h3"
    ]
  ],
  "Request": [
    [
      "cb32-k98-0-b-aa6f0d6fb201"
    ]
  ],
  "log": [
    [
      "qtp17203-77777"
    ],
    [
      "ERROR"
    ],
    [
      "in.nobroker.TestController",
      " [[TestCont]] : saving transaction error\njava.lang.UnsupportedOperationException: Attempted to serialize java.lang.Class: org.hibernate.proxy.HibernateProxy. Forgot to register a type adapter?\n\tat com.google.gson.internal.bind.TypeAdapters$1.write(TypeAdapters.java:69) ~[gson-2.4.jar:na]\n\tat com.google.gson.internal.bind.TypeAdapters$1.write(TypeAdapters.java:63) ~[gson-2.4.jar:na]\n\tat com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:68) ~[gson-2.4.jar:na]"
    ]
  ]
}

Logs Visualisation:

Kibana is a visualisation tool provided by Elastic. It is really easy to configure your Kibana with an Elasticsearch cluster. You can download Kibana from here.
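Pointing Kibana at your cluster is usually a one-line change in kibana.yml (the setting below is for Kibana 7.x; older 6.x releases use elasticsearch.url instead):

# kibana.yml
elasticsearch.hosts: ["http://localhost:9200"]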

You need to create an index pattern before you can visualise your logs. An index pattern is just a wildcard expression matching the Elastic indices which contain your logs.

You can create an index pattern in Kibana (Management -> Index Patterns -> Create index pattern).
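For the Filebeat configuration above, a pattern such as nb_logs-* would match the daily indices it writes to, e.g. nb_logs-admin-logs-2020.07.09.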

Kibana logs visualisation

There you have it: your own log management system which is easy to use, scalable, and secure.

Alternative:

As an alternative, you can use Logstash instead of Filebeat. It bundles both file sniffing and pipelining of the logs. It's easy to set up if you are already familiar with Elasticsearch. You can follow these examples to quickly set it up.
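For a flavour of what that looks like, here is a minimal Logstash configuration sketch, reusing the log path from the Filebeat config above with a placeholder grok pattern:

input {
  file {
    path => "/var/log/nobroker/application.log"
  }
}

filter {
  grok {
    # placeholder pattern; adapt it to your own log format
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{JAVACLASS:class} %{GREEDYDATA:log_message}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "nb_logs-admin-logs-%{+yyyy.MM.dd}"
  }
}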

Why Filebeat:

Logstash needs a JVM to process and transform log lines into meaningful objects which can then be shipped to Elastic for indexing, which means that in ELK all the heavy lifting is done by Logstash. Filebeat, on the other hand, is super lightweight: all it has to do is sniff the files and separate the logs from the lines.

This is part of a series of blogs; TopGun is so awesome that one blog could not do justice to what it does for us. We will soon be sharing part 2, where we talk about application metrics and monitoring using TopGun. We will add the links here when it is published; meanwhile, here is a sneak peek of what is coming.
