Centralized Logging for Ballerina Applications Running in Kubernetes Using EFK Stack

After deploying many Ballerina microservices in a Kubernetes environment, I tried to set up a centralized logging solution for them and ended up with the EFK (Elasticsearch, Fluentd, and Kibana) stack.

If your application runs in a Docker container and writes logs to stdout/stderr, those logs are persisted in the /var/log/containers directory on the Kubernetes nodes. Ballerina writes logs to stdout/stderr by default. So I'm going to run Fluentd on every Kubernetes node, collect the logs from the /var/log/containers directory, and send them to Elasticsearch. Then I'll use Kibana to visualize them.
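As a quick sanity check (assuming you can SSH into one of the nodes), you can inspect these log files yourself. The file name below is illustrative; real entries embed the pod name, namespace, container name, and container ID:

```shell
# On a Kubernetes node: each entry in /var/log/containers is a symlink
# pointing (via /var/log/pods) at the container runtime's JSON log file
ls -l /var/log/containers/

# Tail one of them (illustrative name; substitute a real file from the listing)
sudo tail -n 3 /var/log/containers/<pod>_<namespace>_<container>-<container-id>.log
```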

Prerequisites

  • A Kubernetes cluster. If you don’t have a Kubernetes cluster up and running, grab my Vagrantfile and configure a multi-node Kubernetes cluster in VirtualBox.
  • Your local kubectl should have access to the remote Kubernetes cluster. 
    If you haven’t set up your local kubectl to access the remote cluster yet, read my previous blog and configure it.

Deploying Elasticsearch

Let’s first deploy Elasticsearch as a StatefulSet.

$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/es-statefulset.yaml
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created

Let’s verify the deployment.

$ kubectl get statefulsets -n kube-system
NAME                    DESIRED   CURRENT   AGE
elasticsearch-logging   2         2         3m
$ kubectl get pods -l k8s-app=elasticsearch-logging -n kube-system
NAME                      READY   STATUS    RESTARTS   AGE
elasticsearch-logging-0   1/1     Running   0          4m
elasticsearch-logging-1   1/1     Running   0          3m

Looks good. Let’s deploy a Kubernetes Service to expose the StatefulSet.

$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/es-service.yaml
service/elasticsearch-logging created

Let’s verify the deployment.

$ kubectl get svc -l k8s-app=elasticsearch-logging -n kube-system
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)
elasticsearch-logging   ClusterIP   10.101.152.251   <none>        9200/TCP
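Before wiring up Fluentd, it's worth checking that Elasticsearch is actually responding. One quick way (a sketch; the service name and namespace match the manifests above) is to port-forward to the service and hit the cluster health endpoint:

```shell
# Forward local port 9200 to the elasticsearch-logging service in kube-system
kubectl port-forward -n kube-system svc/elasticsearch-logging 9200:9200 &

# Query cluster health; a "status" of green or yellow means it's up
curl -s 'http://localhost:9200/_cluster/health?pretty'

kill %1  # stop the port-forward when done
```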

Deploying Fluentd DaemonSet

I’m deploying Fluentd as a DaemonSet because I want it to be scheduled on every node, where it can collect the Docker container logs stored in the /var/log/containers directory.

$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/fluentd-es-configmap.yaml
configmap/fluentd-es-config-v0.1.5 created
$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/fluentd-es-ds.yaml
serviceaccount/fluentd-es created
clusterrole.rbac.authorization.k8s.io/fluentd-es created
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es created
daemonset.apps/fluentd-es-v2.2.0 created

Let’s verify.

$ kubectl get daemonsets -n kube-system -l k8s-app=fluentd-es
NAME                DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd-es-v2.2.0   2         2         2       2            2           <none>          58s
$ kubectl get pods -n kube-system -l k8s-app=fluentd-es
NAME                      READY   STATUS    RESTARTS   AGE
fluentd-es-v2.2.0-4pmd2   1/1     Running   0          5m
fluentd-es-v2.2.0-zx5d9   1/1     Running   0          5m
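To confirm Fluentd is actually shipping logs, a quick check (a sketch; the `logstash-*` index names assume the add-on's default `logstash_format true` setting) is to tail a Fluentd pod for connection errors and list the indices it created in Elasticsearch:

```shell
# Tail the Fluentd pods' own logs to spot errors connecting to Elasticsearch
kubectl logs -n kube-system -l k8s-app=fluentd-es --tail=20

# Port-forward to Elasticsearch and list the daily logstash-* indices
kubectl port-forward -n kube-system svc/elasticsearch-logging 9200:9200 &
curl -s 'http://localhost:9200/_cat/indices/logstash-*?v'
kill %1  # stop the port-forward
```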

Deploying Kibana

Let’s deploy a Kubernetes Deployment controller for Kibana.

$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/kibana-deployment.yaml
deployment.apps/kibana-logging created

Let’s verify.

$ kubectl get deployments -n kube-system -l k8s-app=kibana-logging
NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kibana-logging   1         1         1            1           4m
$ kubectl get pods -n kube-system -l k8s-app=kibana-logging
NAME                              READY   STATUS    RESTARTS   AGE
kibana-logging-56c4d58dcd-7xfmq   1/1     Running   0          1m

Let’s deploy a Kubernetes Service (with NodePort service type) to expose our Kibana deployment.

$ kubectl apply -f https://raw.githubusercontent.com/ecomm-integration-ballerina/efk-stack/master/kibana-service.yaml
service/kibana-logging created

Let’s verify.

$ kubectl get svc -n kube-system -l k8s-app=kibana-logging
NAME             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)
kibana-logging   NodePort   10.99.67.110   <none>        5601:32016/TCP

Looks good. We can now access Kibana at http://NodeIP:NodePort, where the NodePort is 32016 and the NodeIP can be any of the Kubernetes node IPs (192.168.205.10, 192.168.205.11, or 192.168.205.12).
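You can also verify from the command line that Kibana is serving requests before opening the browser. This uses one of the node IPs from my Vagrant setup; substitute your own:

```shell
# Hit Kibana's status API via the NodePort; a JSON response means it's up
curl -s http://192.168.205.10:32016/api/status | head -c 200
```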

Testing

Refer to my previous blog on Deploying Ballerina Microservices in Kubernetes, deploy a sample Ballerina service in your Kubernetes cluster, and generate some logs.

Load the Kibana dashboard at http://192.168.205.10:32016/app/kibana.

You can query all the logs from a namespace using the following filter.

kubernetes.namespace_name:"my-namespace"
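A few other handy Lucene-style filters follow the same pattern. The field names below assume the metadata fields added by Fluentd's Kubernetes metadata filter (as configured in the ConfigMap above); the values are placeholders:

```
kubernetes.pod_name:"my-pod-name"
kubernetes.container_name:"my-container" AND log:*error*
kubernetes.namespace_name:"utility" AND stream:"stderr"
```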

My Ballerina service is running in the “utility” namespace, so when I query using the above filter, I get my Ballerina service logs, as shown below.

Likewise, you can configure different filters, but that is out of scope for this article.

Summary

Since Ballerina writes logs to stdout/stderr by default, Ballerina service logs are persisted in the /var/log/containers directory on the Kubernetes nodes. It is very convenient to run a Fluentd process on every Kubernetes node to collect and format the logs, send them to Elasticsearch, and then visualize them using Kibana.
