Application Logging Simplified in Kubernetes: Part 1

Node-Level Logging

Gursimran Singh
Hashmap, an NTT DATA Company
5 min read · Dec 10, 2020


Logging is one of the most critical parts of all production-grade applications and services.

With the rapid growth of the microservices architecture, we have many services running across many environments, which adds to the complexity of collecting and unifying logs.

In this series of blog posts, we will discuss how to collect and store application logs in Kubernetes in a simple but effective way. This is a three-part series in which we will explore different methods for logging in Kubernetes.

Three approaches can be used for log collection in Kubernetes:

  1. Node-Level Logging
  2. Sidecar Container
  3. Direct Logging

This first blog post will focus on Node-Level Logging.

Node-Level Logging

The easiest and most widely adopted logging method for containerized applications is to write to the standard output and standard error streams.

Kubernetes captures all the logs that containers write to stdout and stderr.

Kubernetes stores the logs for each container at the node level; a logging agent then collects them and ships them to a logging backend for persistence.
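As a quick illustration (the pod name and image below are just examples, not part of this project), you can see node-level capture in action with a throwaway pod that writes to stdout:

# Run a throwaway pod that prints to stdout every few seconds
kubectl run echo-test --image=busybox --restart=Never -- \
  sh -c 'while true; do echo "hello from stdout"; sleep 5; done'
# Kubernetes captures the container's stdout/stderr and serves it back via:
kubectl logs echo-test
# On the node itself, the container runtime keeps one log file per container under
#   /var/log/containers/<pod>_<namespace>_<container>-<id>.log
# which is exactly what the node-level logging agent will tail.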

Architecture

Node-level logging architecture (source: https://kubernetes.io/docs/concepts/cluster-administration/logging/)

In this architecture, we will collect container logs with Fluentd as the logging agent and store them in Google Cloud Storage (GCS).

A Kubernetes cluster generally consists of more than one node, so we need to deploy the logging agent on every node. This can be done with the help of a DaemonSet.

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected.
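Once the DaemonSet from the Deployment section below is applied, a quick way to confirm that the agent really landed on every node (the name=fluentd-logger label comes from the manifest later in this post):

# One Fluentd pod should be scheduled per node
kubectl get nodes
kubectl get pods -l name=fluentd-logger -o wide   # the NODE column should cover every node
kubectl get daemonset fluentd-logger              # DESIRED/CURRENT should match the node count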

Setup

You can use the official Fluentd Docker image or create your own.

Step 1: Clone this repository (https://gitlab.com/gursimran.singh1/cluster-logging). It contains all the required files for this project.

File Structure for this repository:

# https://gitlab.com/gursimran.singh1/cluster-logging
cluster-logging
|
├── docker
│   ├── Dockerfile
│   ├── entrypoint.sh
│   └── fluent.conf
|
├── Kubernetes
│   ├── fluentd-configmap.yaml
│   └── fluentd-demonset.yaml
|
├── .gitignore
└── README.md

Step 2: Create a Docker image for Fluentd with all the required plugins.

The docker folder inside the repository contains three files:

  • Dockerfile: Creates a Docker image for Fluentd
  • fluent.conf: Default configuration file used by Fluentd
  • entrypoint.sh: Configures and starts the Fluentd container at runtime.

Build the Docker image and push it to your Docker registry:

# Go inside docker folder
cd docker
# Build docker image
docker build -t <SERVER>/<IMAGE-NAME>:<TAG> .
# Push docker image
docker push <SERVER>/<IMAGE-NAME>:<TAG>
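Before deploying, a couple of optional sanity checks on the image can save a debugging round trip. These commands are suggestions rather than part of the repository, and they assume the image is based on the official fluent/fluentd image (which ships the fluent-gem helper), installs the fluent-plugin-gcs gem, and copies fluent.conf to Fluentd's default location at /fluentd/etc/fluent.conf:

# Confirm the GCS output plugin (needed by "@type gcs" in fluent.conf) is baked into the image
docker run --rm --entrypoint fluent-gem <SERVER>/<IMAGE-NAME>:<TAG> list | grep -i gcs
# Dry-run the bundled configuration to catch syntax errors before deploying
docker run --rm --entrypoint fluentd <SERVER>/<IMAGE-NAME>:<TAG> --dry-run -c /fluentd/etc/fluent.conf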

Deployment

The Fluentd configuration will be created as a ConfigMap, which will be mounted as a file in the Fluentd container.

The Fluentd application will be deployed as a DaemonSet so that it runs on every node.

The Kubernetes folder inside the repository contains two files:

  • fluentd-configmap.yaml: Creates a ConfigMap with the Fluentd configuration.
  • fluentd-demonset.yaml: Deploys Fluentd as a DaemonSet.

Change the placeholder values (such as {PROJECT_NAME} and {BUCKET_NAME}) according to your project:

## Kubernetes/fluentd-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentdconf
data:
  fluentd.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    <match kubernetes.var.log.containers.**kube-system**.log>
      @type null
    </match>
    <match kubernetes.**>
      @type copy
      <store>
        @type gcs
        project {PROJECT_NAME}
        bucket {BUCKET_NAME}
        object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
        path logs/%Y-%m-%d/
        store_as json
        <buffer tag,time>
          @type file
          path /fluentd/logs/temp
          flush_mode immediate
          timekey 5
          time_key_wait 1
          timekey_use_utc true
        </buffer>
        <format>
          @type json
        </format>
      </store>
      <store>
        @type stdout
      </store>
    </match>

## Kubernetes/fluentd-demonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-logger
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-logger
  template:
    metadata:
      labels:
        name: fluentd-logger
    spec:
      tolerations:
      # this toleration is to have the daemonset runnable on master nodes
      # remove it if your masters can't run pods
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd-logger
        image: gursimran14/fluentd:latest
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: fluentconfig
          mountPath: /fluentd/etc
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
      terminationGracePeriodSeconds: 30
      volumes:
      - name: fluentconfig
        configMap:
          name: fluentdconf
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Deploy the ConfigMap and DaemonSet in Kubernetes:

# Go inside Kubernetes folder
cd Kubernetes
# Apply configmap
kubectl apply -f fluentd-configmap.yaml
# Apply DaemonSet
kubectl apply -f fluentd-demonset.yaml

Fluentd will start writing to your GCS bucket as soon as logs are generated.
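To double-check the rollout and the GCS output (the pod name below is a placeholder; the file path and bucket prefix follow the manifests above, and gsutil is assumed to be installed and authenticated):

# Confirm the DaemonSet rolled out and the ConfigMap was mounted where Fluentd expects it
kubectl get daemonset fluentd-logger
kubectl exec -it <FLUENTD-POD-NAME> -- cat /fluentd/etc/fluentd.conf
kubectl logs <FLUENTD-POD-NAME> | tail
# Logs land under logs/YYYY-MM-DD/ in the bucket, per the path and object_key_format settings
gsutil ls gs://{BUCKET_NAME}/logs/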

Final Thoughts

This method only works if your applications write their logs to the standard output and standard error streams. For Cloud Storage authentication, you can use OAuth or a service account.
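If you go the service account route, a minimal sketch looks like the following (the service account, key file, and secret names are illustrative, not taken from the repository; the fluent-plugin-gcs output can pick up the key via its keyfile parameter or the GOOGLE_APPLICATION_CREDENTIALS environment variable):

# Create a key for a GCP service account that has write access to the bucket
gcloud iam service-accounts keys create fluentd-gcs-key.json \
  --iam-account=<SA-NAME>@<PROJECT_NAME>.iam.gserviceaccount.com
# Store it as a Kubernetes Secret so it can be mounted into the Fluentd DaemonSet
kubectl create secret generic fluentd-gcs-credentials --from-file=fluentd-gcs-key.json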

We will continue to explore approaches for log collection in Kubernetes. Keep up with this blog post series and see which approach makes the most sense for you.

Ready to Accelerate Your Digital Transformation?

At Hashmap, we work with our clients to build better together.

If you'd like additional assistance in this area, Hashmap offers a range of enablement workshops and consulting service packages as part of our consulting offerings, and we would be glad to work through your specifics.

Feel free to share on other channels, and be sure to keep up with all new content from Hashmap here. To listen in on a casual conversation about all things data engineering and the cloud, check out Hashmap's podcast Hashmap on Tap on Spotify, Apple, Google, and other popular streaming apps.


Gursimran Singh is a Cloud and Data Engineer with Hashmap providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers. Connect with him on LinkedIn.

