Kubernetes Logging Stack

Vlado Đerek
NSoft
Jun 20, 2019

Mandatory log picture, get it? LOGGING, GET IT?!

A big part of the new cloud narrative is observability, which encompasses monitoring, logging and tracing, with the single purpose of gaining visibility into the performance of your apps and infrastructure.

How to handle logging in Kubernetes has long been a topic of discussion here, and there isn’t a simple answer. I personally have searched for “best practices” and failed. It’s all up to you: act according to your needs and circumstances.

At first, we used our centralized Graylog setup, which was already serving our existing infrastructure, and just deployed Fluentd into our cluster. The issue was that a lot of high-volume log streams ended up in Graylog, including those from the Ambassador API Gateway, Istio, Prometheus and all the other infrastructure apps. You can only imagine the volume of such streams.

In order to rationalize the process, we decided to deploy a local stack into the cluster that would collect all the logs from the cluster (including Kubernetes API events) and store them there. All other application logs can still be forwarded to Graylog if they are critical enough to keep long term.

While searching for a complete solution I ran into many guides on how to deploy an ELK stack for logging, but something was always missing. This article provides a simple and robust way to deploy such a logging stack using a single helm chart (which is just a meta chart that combines other charts).

Components Used

The first part of the stack is Elasticsearch, a distributed search and analytics engine used worldwide, mostly as part of the Elastic Stack (Elasticsearch, Logstash and Kibana, better known as ELK), for open source logging purposes. You can read more at Elasticsearch logging.

Here I am using a helm chart provided by Elastic for deploying Elasticsearch. There is also a chart in stable/charts, but it is scheduled for deprecation (info here). Similar to Elasticsearch, there is a new chart from Elastic for Kibana, which is what will be used here.
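
If you want to work with these charts directly, the Elastic Helm repository can be added with the usual Helm commands (the repo URL is the same one referenced in the requirements below):

helm repo add elastic https://helm.elastic.co
helm repo update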

There is also an operator being developed by Elastic which may be used in the future, but right now it is still in alpha. It is capable of deploying Elasticsearch and Kibana using Kubernetes CRDs and promises a way to dynamically scale storage, which is not possible with the chart. I will update this once it matures a little. More info on the operator here.

The next component used is elasticsearch-curator, designed to rotate older indices in order to save space (logs can take a lot of space, and retention policies vary from company to company; our Graylog setup is used for long-term storage, while this logging stack is more of a short-term store). It also has an option to archive older logs to S3, but that is outside the scope of this article.
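
To give an idea of what curator actually does, here is a hedged sketch of a typical action file; the 14-day retention and the kubernetes_cluster- index prefix are illustrative assumptions, not the exact configuration shipped with the chart:

actions:
  1:
    action: delete_indices
    description: Delete indices older than 14 days, based on the date in the index name
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: kubernetes_cluster-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14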

In order to read logs from container stdout and ship them to the Elasticsearch nodes, fluent-bit is used. Additionally, metricbeat is used to collect Kubernetes API events.
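
For a rough picture of how fluent-bit fits in, this is the kind of configuration the fluent-bit chart renders from its values; the Elasticsearch host and the index prefix below are illustrative assumptions rather than the chart defaults verbatim:

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            docker
    Tag               kube.*

[FILTER]
    Name              kubernetes
    Match             kube.*

[OUTPUT]
    Name              es
    Match             *
    Host              logger-elasticsearch-master
    Port              9200
    Logstash_Format   On
    Logstash_Prefix   kubernetes_cluster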

Metachart

It’s called a metachart because it doesn’t implement any templating of its own; it is used just to enable deploying the whole stack with a single chart that lists the others in its requirements. The chart is located here.
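
For context, the chart itself is little more than metadata plus a requirements file; a hypothetical Chart.yaml (not copied from the repo) would look roughly like this:

# metadata only, no templates/ directory; the components live in requirements.yaml
apiVersion: v1
name: logger
version: 0.1.0
description: Meta chart that deploys the in-cluster logging stack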

The idea is that, by using requirements, one can still set the chart values for each component of the stack, enabling deployment customization via the requirement aliases as shown:

dependencies:
  - name: elasticsearch
    version: 7.1.1
    repository: https://helm.elastic.co
    alias: elasticsearch
  - name: elasticsearch-curator
    version: 1.5.0
    repository: https://kubernetes-charts.storage.googleapis.com
    alias: curator
  - name: kibana
    version: 7.1.1
    repository: https://helm.elastic.co
    alias: kibana
  - name: fluent-bit
    version: 2.0.5
    repository: https://kubernetes-charts.storage.googleapis.com
    alias: fluentbit
  - name: metricbeat
    version: 1.6.4
    repository: https://kubernetes-charts.storage.googleapis.com
    alias: metricbeat

Since the versions are pinned there, the dependency charts can be fetched or updated using:

helm dependency update

which pulls the versions specified in the requirements.yaml file from the upstream repositories. This saves us from maintaining forks.

One can also set values for the upstream charts in the valuefile by nesting them under the alias defined for each requirement:

elasticsearch:
  clusterName: "logger-elasticsearch"
  nodeGroup: "master"

which enables you to use all the options that the chart maintainers created.
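
For illustration, here is a hedged excerpt of such a valuefile; the option names come from the respective upstream charts at the time of writing, and the Elasticsearch host for fluent-bit is an assumption based on the service the Elasticsearch chart creates, so double-check each chart's README before relying on them:

# each top-level key matches a requirement alias; everything under it
# is passed straight to that chart
elasticsearch:
  clusterName: "logger-elasticsearch"
  nodeGroup: "master"
  replicas: 3                  # option from the Elastic elasticsearch chart
kibana:
  service:
    type: ClusterIP            # option from the Elastic kibana chart
fluentbit:
  backend:
    type: es                   # options from the stable fluent-bit chart
    es:
      host: logger-elasticsearch-master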

Deploying the stack

The logging stack is deployed by cloning the chart repo:

git clone git@github.com:volatilemolotov/k8s-logging.git

and entering the chart directory:

cd k8s-logging/logger

The next step is to edit the valuefile in order to set up the stack. The provided default values are more than enough to run it. If you choose to edit them, you can read about the available options in the respective chart repos:

Elasticsearch

Kibana

Elasticsearch-curator

fluent-bit

Metricbeat

One thing to note is that you have to change the valuefile for elasticsearch-curator, because the image used by the upstream chart is no longer maintained and does not work with Elasticsearch 7.1.1. You can use the following image instead; I have simply forked the elasticsearch-curator git repo and set up an automated build on Quay.io. Also, remember to change the command used:

curator:
  cronjob:
    schedule: "0 1 * * *"
  serviceAccount:
    create: true
  image:
    repository: quay.io/volatilemolotov/curator
    tag: v5.7.6
    pullPolicy: IfNotPresent
  hooks:
    install: false
    upgrade: false
  dryrun: false
  command: ["/curator/curator"]
  configMaps:
    # curator config and action file go here (omitted; see the upstream chart for the format)

The next step is to install the chart:

helm upgrade --install logger . --namespace logging -f values.yaml

and wait until all the components are running.
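
For example, you can watch the rollout with:

kubectl get pods -n logging -w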

After that, port-forward into the Kibana pod and go to localhost:5601.
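
The exact pod name depends on the release, so look it up first; something along these lines works:

kubectl get pods -n logging | grep kibana
kubectl port-forward -n logging <kibana-pod-name> 5601:5601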

When starting Kibana for the first time, you need to set up index patterns. This can be done from the Discover panel, where you are presented with a list of indices. Create two patterns, kubernetes_cluster* and kubernetes_events*, and select @timestamp as the Time Filter field name for both.

Index patterns adding screen

Storage Caveat

Given that the Elasticsearch chart deploys a StatefulSet, it is not possible to resize the Persistent Volume Claims simply by changing the chart values (the volumeClaimTemplates of a StatefulSet are immutable). In order to do this you need to manually edit each claim (all three of them if deploying with the default values) and wait until the resize is finished. Use:

kubectl get pvc logger-elasticsearch-master-logger-elasticsearch-master-0 -o yaml

to check the status of the PVC. Once the volume has been resized and only the file system resize remains, you will see the following message:

message: Waiting for user to (re-)start a pod to finish file system resize of 
volume on node.

After this message appears, you need to restart the pod that is using that disk so the file system resize can complete. Make sure you restart the pods one by one.
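
Putting the manual resize flow together, here is a hedged sketch, assuming the release name logger, the logging namespace, a StorageClass that allows volume expansion, and 60Gi as an example target size:

# request a bigger volume on one of the claims
kubectl patch pvc logger-elasticsearch-master-logger-elasticsearch-master-0 \
  -n logging -p '{"spec":{"resources":{"requests":{"storage":"60Gi"}}}}'

# once the "Waiting for user to (re-)start a pod..." condition appears,
# delete the corresponding pod so the StatefulSet recreates it and the
# file system resize completes, then move on to the next claim
kubectl delete pod logger-elasticsearch-master-0 -n logging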

If you wish to contribute, you can join us in our open source projects, or you can apply for a job at NSoft.
