Fluent Logging Architecture - Fluent Bit, Fluentd & Elasticsearch

Semih Şenvardar
hepsiburadatech
Aug 11, 2021

In this article, I will try to explain how we can create solid logging architecture using Fluent Bit, Fluentd, and Elasticsearch.


To summarize the architecture: Fluent Bit runs as a DaemonSet in the Kubernetes cluster, so an instance runs on every node, collecting the stdout logs of the containers and forwarding them to Fluentd, which sits outside the Kubernetes cluster. Fluentd modifies, filters, and processes these logs and sends them to the Elasticsearch cluster. Finally, Kibana visualizes them.

Why does logging matter?

Because logs provide visibility into the behavior of a running application. Eventually, every application will crash, a server will go down, or users will get frustrated by a bug. With a good logging and monitoring infrastructure, diagnosing and solving these problems becomes much easier. Logging is one of the most critical aspects of both legacy applications and today’s modern applications, so we need to consider it carefully.

Why should I use log routers?

Our main motto here is: treat logs as event streams. According to the Twelve-Factor App methodology, an application should never concern itself with routing or storing its output stream. Forwarding, processing, and storing logs shouldn’t be handled by our application. If we use log routers/forwarders, our application is not burdened with unnecessary responsibilities. When our applications are decoupled from the knowledge of log storage and processing, our code becomes simpler.

What are Fluentd, Fluent Bit, and Elasticsearch?

Fluentd is a Ruby-based open-source log collector and processor created in 2011. Fluentd uses about 40 MB of memory and can handle over 10,000 events per second. More than 500 different plugins are available. Fluentd is similar in operation to Logstash in the ELK stack.

Fluent Bit was developed by the same team behind Fluentd, with a focus on high performance and low memory consumption, which makes it more suitable for use within the k8s environment (k8s is short for Kubernetes). You can check out a comparison of the two here.

Elasticsearch is a distributed, scalable, JSON-based search and analytics engine. It is best known as part of the ELK stack (Elasticsearch, Logstash, and Kibana, often extended with Beats). Kibana is the visualization tool for Elasticsearch data.

“Observability in our software systems has always been valuable and has become even more so in this era of cloud and microservices” - Martin Fowler

Architecture

Fluent Logging Architecture

Demonstration

In our implementation, we first need a containerized application that writes its logs to stdout as JSON. Then we need a Kubernetes cluster, a machine with Fluentd installed, and another machine with the ELK stack installed. (My suggestion for the demonstration is to run the EFK stack with Docker; you can find details here.) You can use Minikube as the Kubernetes environment. The reason we use separate Unix-based servers for Fluentd and Elasticsearch is that a bottleneck in one will not affect the other. In this way, we achieve a loosely coupled logging architecture.

stdout refers to the standard output stream, one of the standardized data streams produced by command-line programs in Linux and other Unix-like operating systems.

As a starting point, we will add the libraries that convert logs to JSON into our .NET-based web API application. We will use Serilog, one of the most popular logging libraries, together with its sinks and formatters.

$ dotnet add package Serilog
$ dotnet add package Serilog.Sinks.Console
$ dotnet add package Serilog.Formatting.Elasticsearch
$ dotnet add package Serilog.AspNetCore

After that, we have to configure our Program.cs as below.
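The original Program.cs is embedded as a gist in the article; a minimal sketch of the idea, assuming a .NET 5-style Generic Host application with a Startup class, could look like this (the ElasticsearchJsonFormatter writes each log event to stdout as JSON):

using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;
using Serilog;
using Serilog.Formatting.Elasticsearch;

public class Program
{
    public static void Main(string[] args)
    {
        // Write structured logs to stdout as Elasticsearch-friendly JSON
        Log.Logger = new LoggerConfiguration()
            .Enrich.FromLogContext()
            .WriteTo.Console(new ElasticsearchJsonFormatter())
            .CreateLogger();

        try
        {
            CreateHostBuilder(args).Build().Run();
        }
        finally
        {
            Log.CloseAndFlush();
        }
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .UseSerilog() // replace the default ASP.NET Core logger with Serilog
            .ConfigureWebHostDefaults(webBuilder => webBuilder.UseStartup<Startup>());
}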

You can use the below deployment YAML and service YAML to run the dockerized demo app on your k8s cluster or Minikube environment.
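The original deployment.yml and service.yml are gists in the article; a minimal sketch of what they might contain (the app name, labels, and image are placeholders, and the container is assumed to listen on port 8080 as mentioned below) is:

# deployment.yml - the image name and labels are illustrative
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: <your-registry>/demo-app:latest   # replace with your own image
          ports:
            - containerPort: 8080

# service.yml - exposes the app on a NodePort so it can be reached from outside the cluster
apiVersion: v1
kind: Service
metadata:
  name: demo-app
spec:
  type: NodePort
  selector:
    app: demo-app
  ports:
    - port: 8080
      targetPort: 8080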

Use the below commands to run our .NET app in the k8s environment.

kubectl create -f deployment.yml
kubectl create -f service.yml

You can list the services with the kubectl get svc command and find the NodePort that is mapped to port 8080. To reach the app from a browser, run the kubectl cluster-info command to get the IP address where the k8s master is running, and combine it with the mapped service port. (Example: if the k8s master IP is 192.168.57.80 and the mapped service port is 32077, you can access the app at http://192.168.57.80:32077)

Now it’s time to install Fluent Bit as a DaemonSet on k8s. First, we will create a new namespace called logging using the YAML below.
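The namespace.yml referenced here is a gist in the original article; a minimal equivalent is:

# namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: logging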

kubectl create -f namespace.yml

Next, we will create a service account named fluent-bit to provide an identity for the Fluent Bit pods.
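Again, the original serviceaccount.yml is a gist; a minimal version looks like this:

# serviceaccount.yml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging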

kubectl create -f serviceaccount.yml

We need to define the cluster role and bind this role to the fluent-bit service account. We can use the YAML files below for these operations.
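The original clusterrole.yml and clusterrolebinding.yml are gists; a typical minimal version grants Fluent Bit read access to pods and namespaces (the exact rules here are an assumption, but this is the common setup for Kubernetes metadata enrichment):

# clusterrole.yml - read-only access to pods and namespaces
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]

# clusterrolebinding.yml - binds the role to the fluent-bit service account
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
  - kind: ServiceAccount
    name: fluent-bit
    namespace: logging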

kubectl create -f clusterrole.yml
kubectl create -f clusterrolebinding.yml

After that, we need to create a ConfigMap for Fluent Bit, which allows us to decouple environment-specific configuration from our container images. Let me explain some parameters in the configuration file below. In the [INPUT] section, we use the tail plugin of Fluent Bit, which monitors several text files and reads every file matched by the Path pattern. Kubernetes stores container log files in /var/log/containers, and these files are matched by the Path property. With the Tag and Tag_Regex settings, we define the tag that will be placed on the lines read and a regex that extracts fields from the file name. In the [OUTPUT] section, we send the captured logs to Fluentd with the forward plugin. Lastly, in the [PARSER] section, we parse the fields in the event records. (In this part, I tried to explain some configuration parameters simply. For detailed documentation, see here.)
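The full configmap.yml is a gist in the original article; a simplified sketch along these lines (the tag regex, parser, and buffer limit are illustrative, and FLUENTD_HOST/FLUENTD_PORT are supplied by the DaemonSet environment shown further below) might look like:

# configmap.yml - simplified Fluent Bit configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Daemon        Off
        Log_Level     info
        Parsers_File  parsers.conf

    [INPUT]
        Name          tail
        Path          /var/log/containers/*.log
        Tag           kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex     (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Parser        docker
        Mem_Buf_Limit 5MB

    [OUTPUT]
        Name    forward
        Match   *
        Host    ${FLUENTD_HOST}
        Port    ${FLUENTD_PORT}

  parsers.conf: |
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L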

kubectl create -f configmap.yml

Logging is a cross-cutting concern. It’s best done in plugin components - Robert C. Martin

Finally, we need to run Fluent Bit as a DaemonSet in Kubernetes. This ensures that all nodes run a copy of the Fluent Bit pod. We also need to define the IP address and port of the Fluentd server as environment variables in this file.
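The daemonset.yml gist is not reproduced here either; a simplified sketch (the image tag, Fluentd address, and mount paths are illustrative assumptions) could be:

# daemonset.yml - runs one Fluent Bit pod per node and mounts the host log directories
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    k8s-app: fluent-bit
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit
  template:
    metadata:
      labels:
        k8s-app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:1.8
          env:
            - name: FLUENTD_HOST
              value: "192.168.57.90"   # IP of the Fluentd server (example value)
            - name: FLUENTD_PORT
              value: "24224"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config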

kubectl create -f daemonset.yml

You can also combine all the above configurations in a single manifest file. I prefer to use separate files to keep everything simpler and more understandable.

So far so good, but we have some more work to do. We have to configure Fluentd to capture the logs that the Fluent Bit pods are forwarding. After that, Fluentd will process, filter, and aggregate these logs and send them to Elasticsearch. Hang in there, we are almost done!

In this part, we will work with td-agent. Treasure Agent (td-agent) is a stable distribution package of Fluentd maintained by Treasure Data; Fluentd itself is a CNCF-hosted project. (What are the differences between td-agent and Fluentd? Also, I assume you have already installed Fluentd.) Since we install Fluentd using the td-agent packages, the config file should be at /etc/td-agent/td-agent.conf. (If you installed Fluentd using the Ruby gem or a Docker container, you should have a look here.) Fluentd expects the configuration file to be UTF-8 or ASCII encoded. Here is our sample td-agent.conf file:
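The sample td-agent.conf is embedded as a gist in the article; a minimal sketch with the same structure (the Elasticsearch host, index prefix, and filter are illustrative assumptions) could look like this:

# /etc/td-agent/td-agent.conf
<source>
  @type forward            # listen for logs forwarded by the Fluent Bit pods
  port 24224
  bind 0.0.0.0
</source>

<filter kube.**>
  @type record_transformer # enrich each record with the hostname of the Fluentd server
  <record>
    fluentd_host "#{Socket.gethostname}"
  </record>
</filter>

<match kube.**>
  @type elasticsearch      # requires the fluent-plugin-elasticsearch gem (bundled with td-agent)
  host 192.168.57.100      # IP of the Elasticsearch server (example value)
  port 9200
  logstash_format true     # creates daily indices such as k8s-logs-2021.08.11
  logstash_prefix k8s-logs
  <buffer>
    @type file
    path /var/log/td-agent/buffer/elasticsearch
    flush_interval 10s
  </buffer>
</match>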

Let’s explain some parameters in the configuration file. source directives determine the input sources, match directives determine the output destinations, and filter directives determine the event processing pipelines. (More detailed explanations are here.) After this short explanation, we have to reload our td-agent service via systemd by running the following command. (If you encounter any issues, you can check the logs at /var/log/td-agent/td-agent.log.)

sudo systemctl reload td-agent

Better late than never, our logs should have started flowing into Elasticsearch. Let’s check them out in Kibana. Click Management -> Stack Management in the Kibana sidebar and choose Index Patterns on the screen that opens. (In some installations, you may find this section under Management -> Index Patterns.)

When we switch to the Discover section in Kibana, we can view our logs.

Conclusion

As with any solution in the software world, there are trade-offs in this architecture. Fluent Bit could collect, process, and filter all logs and forward them directly to Elasticsearch. In that setup, however, resource consumption could bottleneck our Kubernetes cluster when the system is under heavy load, and a problem with the Fluent Bit pods could even take down the cluster. That’s why we reduced Fluent Bit’s responsibility to simply forwarding the logs, which gave us significant memory savings in our Kubernetes cluster. In return, we ended up with a Fluentd server that we have to operate and maintain.

Fluent Bit and Fluentd are very well-documented solutions, so if you encounter any problems, I think you can solve them easily. In order not to complicate the article, I did not cover some installations (Fluentd & ELK). If you run into a problem related to these, you can contact me at any time. You can find detailed information about the topics in the article in the references section. Also, you can find a simple demo example here.

If you have read this far, thank you for your patience and support. I hope it was useful for you.
