Application Logging Simplified in Kubernetes: Part 1

Node-Level Logging

Gursimran Singh
Hashmap, an NTT DATA Company
5 min read · Dec 10, 2020


Logging is one of the most critical parts of all production-grade applications and services.

With the rapid growth of the microservices architecture, we have many services running across many environments, which adds to the complexity of collecting and unifying logs.

In this series of blog posts, we will discuss how to collect and store application logs in Kubernetes in a simple but effective way. This is a three-part series in which we will explore different methods for logging in Kubernetes.

Three approaches can be used for log collection in Kubernetes:

  1. Node-Level Logging
  2. Sidecar Container
  3. Direct Logging

This first blog post will focus on Node-Level Logging.

Node-Level Logging

The easiest and most widely adopted logging method for containerized applications is to write to the standard output and standard error streams.

Kubernetes captures all the logs that containers write to stdout and stderr.

Kubernetes stores the logs for each container at the node level; a logging agent then collects them and ships them to a logging backend for persistence.
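As a quick illustration (the pod name and image below are just examples, not part of this project), you can see node-level capture in action with a throwaway pod that writes to stdout:

# Run a throwaway pod that prints to stdout every few seconds
kubectl run echo-test --image=busybox --restart=Never -- \
  sh -c 'while true; do echo "hello from stdout"; sleep 5; done'
# Kubernetes captures the container's stdout/stderr and serves it back via:
kubectl logs echo-test
# On the node itself, the container runtime keeps one log file per container under
#   /var/log/containers/<pod>_<namespace>_<container>-<id>.log
# which is exactly what the node-level logging agent will tail.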

Architecture

Node-level logging architecture (source: https://kubernetes.io/docs/concepts/cluster-administration/logging/)

In this architecture, we will collect container logs with Fluentd as the logging agent and store them in Google Cloud Storage (GCS).

A Kubernetes cluster generally consists of more than one node, so we need to deploy the logging agent on every node. This can be done with the help of a DaemonSet.

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected.
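Once the DaemonSet from the Deployment section below is applied, a quick way to confirm that the agent really landed on every node (the name=fluentd-logger label comes from the manifest later in this post):

# One Fluentd pod should be scheduled per node
kubectl get nodes
kubectl get pods -l name=fluentd-logger -o wide   # the NODE column should cover every node
kubectl get daemonset fluentd-logger              # DESIRED/CURRENT should match the node count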

Setup

You can use the official Fluentd Docker image or create your own.

Step 1: Clone this repository (https://gitlab.com/gursimran.singh1/cluster-logging). It contains all the required files for this project.

File Structure for this repository:

# https://gitlab.com/gursimran.singh1/cluster-logging
cluster-logging
|
├── docker
│   ├── Dockerfile
│   ├── entrypoint.sh
│   └── fluent.conf
|
├── Kubernetes
│   ├── fluentd-configmap.yaml
│   └── fluentd-demonset.yaml
|
├── .gitignore
└── README.md

Step 2: Create a Docker image for Fluentd with all the required plugins.

The docker folder inside the repository contains three files:

  • Dockerfile: Creates a Docker image for Fluentd
  • fluent.conf: Default configuration file used by Fluentd
  • entrypoint.sh: Configures and starts the Fluentd container at runtime.

Build the Docker image and push it to your Docker registry:

# Go inside docker folder
cd docker
# Build docker image
docker build -t <SERVER>/<IMAGE-NAME>:<TAG> .
# Push docker image
docker push <SERVER>/<IMAGE-NAME>:<TAG>
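Before deploying, a couple of optional sanity checks on the image can save a debugging round trip. These commands are suggestions rather than part of the repository, and they assume the image is based on the official fluent/fluentd image (which ships the fluent-gem helper), installs the fluent-plugin-gcs gem, and copies fluent.conf to Fluentd's default location at /fluentd/etc/fluent.conf:

# Confirm the GCS output plugin (needed by "@type gcs" in fluent.conf) is baked into the image
docker run --rm --entrypoint fluent-gem <SERVER>/<IMAGE-NAME>:<TAG> list | grep -i gcs
# Dry-run the bundled configuration to catch syntax errors before deploying
docker run --rm --entrypoint fluentd <SERVER>/<IMAGE-NAME>:<TAG> --dry-run -c /fluentd/etc/fluent.conf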

Deployment

The Fluentd configuration will be created as a ConfigMap, which will be mounted as a file in the Fluentd container.

The Fluentd application will be deployed as a DaemonSet so that it runs on every node.

The Kubernetes folder inside the repository contains two files:

  • fluentd-configmap.yaml: Creates a ConfigMap with the Fluentd configuration.
  • fluentd-demonset.yaml: Deploys Fluentd as a DaemonSet.

Change the placeholder values (such as {PROJECT_NAME} and {BUCKET_NAME}) according to your project:

## Kubernetes/fluentd-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentdconf
data:
  fluentd.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    <match kubernetes.var.log.containers.**kube-system**.log>
      @type null
    </match>
    <match kubernetes.**>
      @type copy
      <store>
        @type gcs
        project {PROJECT_NAME}
        bucket {BUCKET_NAME}
        object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
        path logs/%Y-%m-%d/
        store_as json
        <buffer tag,time>
          @type file
          path /fluentd/logs/temp
          flush_mode immediate
          timekey 5
          time_key_wait 1
          timekey_use_utc true
        </buffer>
        <format>
          @type json
        </format>
      </store>
      <store>
        @type stdout
      </store>
    </match>

## Kubernetes/fluentd-demonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-logger
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-logger
  template:
    metadata:
      labels:
        name: fluentd-logger
    spec:
      tolerations:
      # this toleration is to have the daemonset runnable on master nodes
      # remove it if your masters can't run pods
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd-logger
        image: gursimran14/fluentd:latest
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: fluentconfig
          mountPath: /fluentd/etc
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
      terminationGracePeriodSeconds: 30
      volumes:
      - name: fluentconfig
        configMap:
          name: fluentdconf
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Deploy the ConfigMap and DaemonSet in Kubernetes:

# Go inside Kubernetes folder
cd Kubernetes
# Apply configmap
kubectl apply -f fluentd-configmap.yaml
# Apply DaemonSet
kubectl apply -f fluentd-demonset.yaml

Fluentd will start writing to your GCS bucket as soon as logs are generated.
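To double-check the rollout and the GCS output (the pod name below is a placeholder; the file path and bucket prefix follow the manifests above, and gsutil is assumed to be installed and authenticated):

# Confirm the DaemonSet rolled out and the ConfigMap was mounted where Fluentd expects it
kubectl get daemonset fluentd-logger
kubectl exec -it <FLUENTD-POD-NAME> -- cat /fluentd/etc/fluentd.conf
kubectl logs <FLUENTD-POD-NAME> | tail
# Logs land under logs/YYYY-MM-DD/ in the bucket, per the path and object_key_format settings
gsutil ls gs://{BUCKET_NAME}/logs/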

Final Thoughts

This method only works if your applications write their logs to the standard output and standard error streams. For Cloud Storage authentication, you can use OAuth or a service account.
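If you go the service account route, a minimal sketch looks like the following (the service account, key file, and secret names are illustrative, not taken from the repository; the fluent-plugin-gcs output can pick up the key via its keyfile parameter or the GOOGLE_APPLICATION_CREDENTIALS environment variable):

# Create a key for a GCP service account that has write access to the bucket
gcloud iam service-accounts keys create fluentd-gcs-key.json \
  --iam-account=<SA-NAME>@<PROJECT_NAME>.iam.gserviceaccount.com
# Store it as a Kubernetes Secret so it can be mounted into the Fluentd DaemonSet
kubectl create secret generic fluentd-gcs-credentials --from-file=fluentd-gcs-key.json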

We will continue to explore approaches for log collection in Kubernetes. Keep up with this blog post series and see which approach makes the most sense for you.

Ready to Accelerate Your Digital Transformation?

At Hashmap, we work with our clients to build better together.

If you'd like additional assistance in this area, Hashmap offers a range of enablement workshops and consulting service packages as part of our consulting offerings, and we would be glad to work through your specifics.

Feel free to share on other channels, and be sure to keep up with all new content from Hashmap here. To listen in on a casual conversation about all things data engineering and the cloud, check out Hashmap's podcast Hashmap on Tap on Spotify, Apple, Google, and other popular streaming apps.


Gursimran Singh is a Cloud and Data Engineer with Hashmap providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers. Connect with him on LinkedIn.

