Publish logs to kafka with filebeat

Filebeat to kafka with docker


I have been working on building a research product to store server logs in a secure storage (obviously in a blockchain :)). We named this project logikos. logikos is a greek word which means logical in english :). As a part of this project we had to take logs from different services and store them in cassandra. In this case, I first took the different service logs from filebeat to kafka. Then I streamed the data from kafka to cassandra using akka streams (I have written about that story here).

Filebeat with kafka

We can configure filebeat to extract log file contents from local/remote servers. Filebeat guarantees that the contents of the log files will be delivered to the configured output at least once and with no data loss. It achieves this behavior because it stores the delivery state of each event in a registry file.

Normally filebeat integrates with logstash in the ELK stack. Filebeat listens for new content in the log files and publishes it to logstash. Logstash filters the events and publishes them to elasticsearch. Then kibana displays them on a dashboard. Now the latest version of filebeat supports outputting log file data directly to kafka.

In this post I’m gonna show how I have integrated filebeat with kafka to take the logs from different services. All the source code which relates to this post is available in the beats gitlab repo. Please clone it and follow the steps below. I’m running all my services with docker.

1. Kafka/Zookeeper docker-compose

Following is the content of the docker-compose.yml which is responsible for running zookeeper and kafka.
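A minimal sketch of that compose file, assuming the wurstmeister zookeeper/kafka images and a pre-created topic named log; the actual repo may pin different images, versions, or environment settings:

```yaml
version: '2'

services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"

  kafka:
    image: wurstmeister/kafka
    ports:
      # expose the broker on port 9092 of the host machine
      - "9092:9092"
    environment:
      # advertise the host address so local clients can connect
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # create the `log` topic with 1 partition, replication factor 1
      KAFKA_CREATE_TOPICS: "log:1:1"
    depends_on:
      - zookeeper
```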

2. Run kafka/zookeeper

I can start zookeeper and kafka with the below commands. In here I’m running a single instance of zookeeper and a single instance of kafka. Kafka will be available on port 9092 on my local machine.

docker-compose up -d zookeeper
docker-compose up -d kafka

3. Filebeat docker-compose

Following is the content of the docker-compose.yml which is responsible for running filebeat.

In here I have defined four docker volumes: the filebeat config file, the filebeat data directory, and two service log files which exist on my local machine. In the filebeat config I have specified these log files as the inputs (prospectors).
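A sketch of that filebeat service with the four volumes described above, assuming a filebeat 6.x image; the host-side log paths (./logs/serviceA.log and ./logs/serviceB.log) are placeholders for the actual service log locations:

```yaml
version: '2'

services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.2.4
    volumes:
      # replace the default config with the local filebeat.yml
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      # persist the registry (delivery state) across restarts
      - ./data:/usr/share/filebeat/data
      # mount the two service log files to be shipped (placeholder paths)
      - ./logs/serviceA.log:/var/log/serviceA.log:ro
      - ./logs/serviceB.log:/var/log/serviceB.log:ro
```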

4. Filebeat config

The filebeat config file lives at /usr/share/filebeat/filebeat.yml inside the docker container. I’m replacing that file’s content with the filebeat.yml file which exists in my local directory (via a docker volume). It defines all the log file configurations and the kafka output configuration.
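A minimal filebeat.yml along those lines, with the two mounted log files as prospectors and the kafka output pointed at the broker on localhost:9092 and the topic log; the container-side paths are assumptions and the exact options in the repo may differ:

```yaml
# log inputs (prospectors) - the two mounted service log files
filebeat.prospectors:
  - type: log
    enabled: true
    paths:
      - /var/log/serviceA.log
      - /var/log/serviceB.log

# publish events to the kafka `log` topic
output.kafka:
  hosts: ["localhost:9092"]
  topic: "log"
  required_acks: 1
  compression: gzip
```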

5. Run filebeat

Now I can start filebeat with the below command. It will start to read the log file contents defined in filebeat.yml and push them to the kafka topic log.

docker-compose up -d filebeat

6. Test with kafkacat

Now filebeat is publishing the log file content into the log kafka topic. In order to view the published content we can create a kafka console consumer with the kafkacat command. You can get kafkacat from here.

# kafka broker - localhost:9092
# kafka topic - log
kafkacat -C -b localhost:9092 -t log