How to create custom indices based on Kubernetes metadata using Fluentd

Anup Dubey
FAUN — Developer Community 🐾
3 min read · Jan 16, 2020


In a previous article, we learned how to set up Fluentd on Kubernetes with the default configuration. In this article, we will learn how to create custom indices with Fluentd based on Kubernetes metadata, tweaking an EFK stack on Kubernetes.

Here, I will be using the Kubernetes metadata plugin to add metadata to each log record. This plugin is already installed in the Docker image (fluent/fluentd-kubernetes-daemonset:v1.1-debian-elasticsearch), or you can install it with “gem install fluent-plugin-kubernetes_metadata_filter” in your own Fluentd Dockerfile. Add the following filter to the Fluentd config to enrich the logs with metadata:

# we use the kubernetes metadata plugin to add metadata to the log
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
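After this filter runs, each record carries a kubernetes key describing the pod it came from. As a rough illustration (field values here are made up; the exact set of fields depends on your cluster and plugin version), an enriched record looks something like:

```json
{
  "log": "GET /healthz 200\n",
  "stream": "stdout",
  "kubernetes": {
    "pod_name": "my-app-6d4b75cb6d-xkzvp",
    "namespace_name": "default",
    "container_name": "my-app",
    "host": "worker-node-1",
    "labels": { "app": "my-app" }
  }
}
```

It is these kubernetes.* fields that we will reference below to build the index name.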

Here, the source section is the same as in the default setup from the previous article. I will customize the match section of the default config to create a custom index from Kubernetes metadata; in this case, an index based on the pod-name metadata. The required changes to the match section are below:

# we send the logs to Elasticsearch
<match kubernetes.**>
  @type elasticsearch_dynamic
  @log_level info
  include_tag_key true
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
  reload_connections true
  logstash_format true
  logstash_prefix ${record['kubernetes']['pod_name']}
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever true
    retry_max_interval 30
    chunk_limit_size 2M
    queue_limit_length 32
    overflow_action block
  </buffer>
</match>

The logstash_prefix “${record['kubernetes']['pod_name']}” uses Kubernetes metadata to create one index per pod name. You can also build the index from any other Kubernetes metadata field (like namespace or deployment), and you can tweak the rest of the output configuration for logging data to ES as per your need. Suppose you don't want to send certain unwanted logs to ES, such as those from the Fluentd, kube-system, or other infrastructure namespaces; add these lines before the Elasticsearch output to drop them:

<match kubernetes.var.log.containers.**kube-logging**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**kube-system**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**monitoring**.log>
  @type null
</match>
<match kubernetes.var.log.containers.**infra**.log>
  @type null
</match>
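As mentioned above, the prefix can use any metadata field the filter attached. For example, an index per namespace (an illustrative variant, not part of the config above) only changes the prefix line inside the match block:

```
# one index per namespace instead of one per pod
logstash_prefix ${record['kubernetes']['namespace_name']}
```

A per-namespace index usually scales better than a per-pod one on busy clusters, since pod names change on every redeploy and each new name creates a fresh index.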

Here, we have the final kube-manifest ConfigMap. We just need to apply the changes to the k8s cluster and perform a rolling deploy of the existing Fluentd DaemonSet.

$ kubectl apply -f fluentd-config-map-custome-index.yaml
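Fluentd only reads its config at startup, so after applying the ConfigMap the running pods must be restarted to pick it up. A sketch of the rollout (the DaemonSet name fluentd and namespace kube-logging are assumptions from a typical EFK setup; adjust them to match your manifests):

```shell
# restart the Fluentd pods so they load the new ConfigMap
$ kubectl rollout restart daemonset/fluentd -n kube-logging

# watch until all pods come back up with the new config
$ kubectl rollout status daemonset/fluentd -n kube-logging
```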

After applying the changes, indices named after the pods appear in Elasticsearch and can be explored in Kibana.
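One quick way to verify is to list the indices directly from Elasticsearch (the host and port here are assumptions; use your own ES endpoint or port-forward the service):

```shell
# list indices; with logstash_format the names look like <pod_name>-YYYY.MM.DD
$ curl http://elasticsearch.kube-logging.svc:9200/_cat/indices?v
```

Each pod should show up as its own index, with the date suffix that logstash_format appends.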

I hope this blog was useful to you. Looking forward to claps and suggestions. For any queries, feel free to comment.

For more related content, visit https://opendevops.in/

Don’t forget to check out my other posts:

  1. Deploying and Scaling Jenkins on Kubernetes
  2. Create a Kubernetes Cluster on Amazon EKS
  3. How to set up the Elasticsearch cluster on Kubernetes
  4. How to host Helm chart repository on GitHub

