The LOG Battle: Logstash and Fluentd

Computing systems generate log files. Logs exist to track specific changes or events that have happened in a system or an environment. As a system scales, it becomes hard to keep track of the log files it generates. This is the point at which every organisation has to think about a systematic log management approach.

Centralised Log Management is an approach in which the logs generated by sub-systems within the same environment are collected, parsed and stored in a central repository in an organised fashion, thereby reducing the overall effort needed to identify an issue. Various open source and paid tools are available to accomplish this. Some examples are Splunk, LogRhythm, LogPacker, Logstash and Fluentd.
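The collect, parse and store steps can be illustrated with a minimal Python sketch. It is not how Logstash or Fluentd work internally. The line format, the made-up sample lines, and the names `parse_line` and `centralise` are all assumptions chosen for illustration, and an in-memory dictionary stands in for the central repository.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) line format: "<timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse_line(source, line):
    """Turn one raw log line into a structured record, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["source"] = source  # remember which sub-system produced the line
    return record


def centralise(streams):
    """Collect lines from every sub-system, parse them, and store them in one place."""
    repository = defaultdict(list)  # stand-in for a central store, keyed by log level
    for source, lines in streams.items():
        for line in lines:
            record = parse_line(source, line)
            if record is not None:
                repository[record["level"]].append(record)
    return repository


# Made-up sample data from two hypothetical sub-systems.
streams = {
    "web": [
        "2024-01-01T10:00:00Z INFO request served",
        "2024-01-01T10:00:01Z ERROR upstream timeout",
    ],
    "db": ["2024-01-01T10:00:02Z ERROR connection refused"],
}

repository = centralise(streams)
for record in repository["ERROR"]:
    print(record["source"], record["timestamp"], record["message"])
```

Once the records sit in one place, finding every error across all sub-systems becomes a single lookup instead of a search through separate files on separate machines.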

In this blog, we will be comparing two popular open source log management tools that can be used to reduce the burden of managing logs.

Logstash and Fluentd

Logstash is part of the popular ELK stack. It is an open source data processing pipeline that can be used for collecting, parsing, and storing logs from different sources. These logs can either be indexed in Elasticsearch or pushed to other storage. Logstash accepts input from a wide range of sources and has more than 50 input plugins, which help the user connect to platforms, databases and applications.

Fluentd is also an open source data collector that can collect, parse, transform and analyze data and then store it. It is a project of the Cloud Native Computing Foundation (CNCF) and has a large number of input plugins that can be used to connect with various platforms for accepting data.

Whether you have a Windows or a Linux workload, both Logstash and Fluentd can run on it. Both are also released under the Apache License 2.0.

How are they different?

Now that we have an idea of what these tools are, let us dive a little deeper into the technology behind them. In this section, we will try to understand how they differ.

1. Written Language:

Logstash is written in JRuby, a Java implementation of the Ruby programming language. This means that the machine running Logstash needs a Java runtime. Fluentd, on the other hand, is written in CRuby, the C implementation of the Ruby programming language.

2. Routing an event:

When it comes to event routing, Logstash and Fluentd have different approaches.

In Logstash, an event is routed by writing if-then statements, whereas in Fluentd routing is based on tags. For a programmer it will not be much of a hassle to write conditional statements in Logstash, but Fluentd has the more straightforward approach.
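The difference between the two routing styles can be sketched in Python. This is only an illustration of the idea, not the configuration syntax of either tool. The event fields, tag names and destination names are hypothetical, and shell-style wildcards from Python's `fnmatch` module are used as a stand-in for tag patterns (Fluentd's own match patterns follow their own rules).

```python
from fnmatch import fnmatchcase


def route_with_conditionals(event):
    """Conditional style: inspect the event's fields with if/else to pick an output."""
    if event.get("type") == "nginx" and event.get("status", 0) >= 500:
        return "alerts"
    elif event.get("type") == "nginx":
        return "web_index"
    else:
        return "default_index"


# Tag style: every event carries a tag, and the first matching pattern wins.
# All patterns and destinations below are hypothetical.
TAG_ROUTES = [
    ("nginx.error", "alerts"),
    ("nginx.*", "web_index"),
    ("*", "default_index"),
]


def route_with_tags(tag):
    """Return the destination of the first route whose pattern matches the tag."""
    for pattern, destination in TAG_ROUTES:
        if fnmatchcase(tag, pattern):
            return destination
    return None


print(route_with_conditionals({"type": "nginx", "status": 502}))
print(route_with_tags("nginx.access"))
```

In the conditional style, the routing logic grows with every new rule written into the code path. In the tag style, adding a route means adding one entry to a table, which is why tag-based routing tends to stay readable as a setup becomes more complex.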

3. Plugins:

Fluentd leads Logstash in the number of plugins available, but it has no centralized repository for them. With Logstash, by contrast, all the plugins are accessible in its GitHub repository.

4. Performance:

Neither tool is a clear winner on performance, apart from the fact that Logstash consumes more memory than Fluentd. Both ecosystems offer a lightweight shipper to mitigate this issue.

a. Filebeat

Filebeat is a lightweight log shipper for Logstash. It can be installed as an agent on your servers to collect operational data. Because it is lightweight, it consumes far fewer system resources than Logstash. It has a number of modules that help with collecting, parsing and visualizing logs. Moreover, it is container ready, so if you are looking to build a container-based solution you can run Filebeat inside a container on the same virtual machine.

b. Fluent Bit

Fluent Bit is a lightweight shipper that collects data from different sources and sends it to multiple destinations. Fluent Bit is also container ready.

Which one to use?

This really is a tough call! Both Fluentd and Logstash have very active communities, so you will have a number of plugins to choose from depending on your requirements. Given its tag-based approach to event routing, Fluentd will be more convenient if you are building a complex solution. Fluentd is compatible with both Elasticsearch and Kibana, so you can use it alongside them to search and dashboard your data. Otherwise, you can go with the native ELK stack.