Kubernetes Logging Tutorial For Beginners

Bibin Wilson
Published in DevOps Learners · Dec 14, 2021

Note: This post was originally published on https://devopscube.com/kubernetes-logging-tutorial/

In this Kubernetes logging tutorial, you will learn the key concepts and workflows involved in Kubernetes cluster logging.

When it comes to Kubernetes production debugging, logging plays a crucial role. It helps you understand what is happening, what went wrong, and even what could go wrong. As a DevOps engineer, you should clearly understand Kubernetes logging to troubleshoot cluster and application issues.

How does Kubernetes Logging Work?

In Kubernetes, most components run as containers. In the Kubernetes construct, an application pod can contain multiple containers. Most of the Kubernetes cluster components, like the api-server, kube-scheduler, etcd, and kube-proxy, run as containers. However, the kubelet component runs as a native systemd service.

In this section, we will look at how logging works for Kubernetes pods. It could be an application pod or a Kubernetes component pod. We will also look at how kubelet systemd logs are managed.

Generally, any pod we deploy on Kubernetes writes its logs to the stdout and stderr streams as opposed to a dedicated log file. Whatever each container streams to stdout and stderr is then stored on the node's filesystem in JSON format. The underlying container engine does this work, and it is designed to handle logging. For example, the Docker container engine.

Note: All the kubernetes cluster component logs are processed just like any other container log.

Kubelet runs on all the nodes to ensure the containers on the node are healthy and running. It is also responsible for running the static pods. If kubelet runs as a systemd service, it writes its own logs to journald.

Also, if the container doesn't stream its logs to STDOUT and STDERR, you will not get the logs using the "kubectl logs" command because kubelet won't have access to the log files.
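For example, a simple pod that writes to stdout (the names here are only illustrative) can be read immediately with kubectl logs:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    # write one line per second to stdout so the container runtime captures it
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']

# deploy the pod and read its stdout stream
kubectl apply -f counter-pod.yaml
kubectl logs counter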

Kubernetes Pod Log Location

You can find the kubernetes pod logs in the following directories of every worker node.

  1. /var/log/containers: All the container logs are present in a single location.
  2. /var/log/pods/: Under this location, the container logs are organized into separate pod folders following the naming scheme /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/. Each pod folder contains the individual container folders and their respective log files.

Also, if your underlying container engine is Docker, you will find the logs in the /var/lib/docker/containers directory as well.

If you log in to any Kubernetes worker node and go to the /var/log/containers directory, you will find a log file for each container running on that node. These files are symlinks that point to the actual log files under /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/. An example is shown below.
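For instance, on a worker node you might see something like the following (a hypothetical listing; the pod name, namespace, and IDs will differ in your cluster):

# on the worker node
ls -l /var/log/containers/
# nginx_default_nginx-4f1c...log -> /var/log/pods/default_nginx_<pod-uid>/nginx/0.log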

Also, these log files are managed by the kubelet, so when you run the kubectl logs command, the kubelet reads these files and returns the log content to your terminal.

Kubelet Logs

In the case of Kubelet, you can access the logs from the individual worker nodes using journalctl. For example, use the following commands to check the Kubelet logs.

journalctl -u kubelet
journalctl -u kubelet -o cat

If the Kubelet is running without systemd, you can find the Kubelet logs in the /var/log directory.

Kubernetes Container Log Format

As explained earlier, all the log data is stored in JSON format. Therefore, if you open any of the log files, you will find three keys for each log entry.

  1. log — The actual log data
  2. stream — The stream (stdout or stderr) the log was written to
  3. time — The timestamp of the log entry

If you open any log file, you will see the keys mentioned above in JSON format. For example, the following shows an entry from an Nginx container's log file, prettified using jq.
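A single log entry looks roughly like this (the content is illustrative, assuming the Docker json-file format described above):

# prettify one container log file on the node
cat /var/log/pods/default_nginx_<pod-uid>/nginx/0.log | jq .
{
  "log": "10.244.0.1 - - [14/Dec/2021:10:15:32 +0000] \"GET / HTTP/1.1\" 200 612\n",
  "stream": "stdout",
  "time": "2021-12-14T10:15:32.123456789Z"
}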

Types of Kubernetes Logs

When it comes to Kubernetes, the following are the different types of logs:

  1. Application logs: Logs from user-deployed applications. Application logs help in understanding what is happening inside the application.
  2. Kubernetes cluster component logs: Logs from the api-server, kube-scheduler, etcd, kube-proxy, etc. These logs help you troubleshoot Kubernetes cluster issues.
  3. Kubernetes audit logs: All logs related to API activity recorded by the API server. These are primarily used for investigating suspicious API activity.

Kubernetes Logging Architecture

If we take the Kubernetes cluster as a whole, we would need to centralize the logs. There is no default Kubernetes functionality to centralize the logs. You need to set up a centralized logging backend (Eg: Elasticsearch) and send all the logs to the logging backend. The following image depicts a high-level kubernetes logging architecture.

Let’s understand the three key components of logging.

  1. Logging Agent: An agent that typically runs as a DaemonSet on all the Kubernetes nodes and continuously streams the logs to the centralized logging backend. The logging agent can also run as a sidecar container. For example, Fluentd.
  2. Logging Backend: A centralized system that is capable of storing, searching, and analyzing log data. A classic example is Elasticsearch.
  3. Log Visualization: A tool to visualize log data in the form of dashboards. For example, Kibana.

Kubernetes Logging Patterns

This section will look at some of the Kubernetes logging patterns used to stream logs to a logging backend. These are the three key Kubernetes cluster logging patterns:

  1. Node level logging agent
  2. Streaming sidecar container
  3. Sidecar logging agent

Let’s look at each method in detail.

1. Node Level Logging Agent

source:Kubernetes.io

In this method, a node-level logging agent (e.g., Fluentd) reads the log files created from the container STDOUT and STDERR streams and then sends them to a logging backend like Elasticsearch. This is a commonly used logging pattern and works well with minimal overhead on the application.

Even the twelve-factor app methodology suggests streaming logs to STDOUT.

In managed Kubernetes services, the backend would be the cloud provider's logging service, for example AWS CloudWatch for EKS and Google Stackdriver for GKE.
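A trimmed-down DaemonSet for such a node-level agent usually just mounts the node's log directories, roughly like this (a minimal sketch using Fluent Bit; the image tag is only an example and the output configuration for a backend like Elasticsearch is not shown):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.8   # example agent image; backend output config omitted
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers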

2. Streaming Sidecar Container

source:Kubernetes.io

This streaming sidecar method is useful when the application cannot write logs to the STDOUT and STDERR streams directly.

So, the application container writes all the logs to a file within the container. Then a sidecar container reads from that log file and streams it to STDOUT and STDERR. The rest is the same as in the first method.
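A minimal sketch of this pattern (the images and file paths are just for illustration) looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-streaming-sidecar
spec:
  containers:
  - name: app
    image: busybox
    # the application writes to a file instead of stdout
    args: [/bin/sh, -c, 'while true; do echo "$(date) app log line" >> /var/log/app/app.log; sleep 5; done']
    volumeMounts:
    - name: applog
      mountPath: /var/log/app
  - name: log-streamer
    image: busybox
    # the sidecar tails the file and writes it to its own stdout
    args: [/bin/sh, -c, 'tail -n +1 -F /var/log/app/app.log']
    volumeMounts:
    - name: applog
      mountPath: /var/log/app
  volumes:
  - name: applog
    emptyDir: {}

The node-level agent then picks up the sidecar's stdout as usual, and kubectl logs app-with-streaming-sidecar -c log-streamer shows the application logs.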

3. Sidecar Logging Agent

source:Kubernetes.io

In this method, the logs don’t get streamed to STDOUT and STDERR. Instead, a sidecar container with a logging agent would be running along with the application container. Then, the logging agent would directly stream the logs to the logging backend.

There are two downsides to this approach.

  1. Running a logging agent as a sidecar is resource-intensive.
  2. You won’t get the logs using kubectl logs command as Kubelet will not handle the logs.
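A rough sketch of a pod spec for this pattern (assuming a Fluent Bit sidecar whose input/output configuration, e.g. for Elasticsearch, comes from a pre-existing ConfigMap named fluent-bit-sidecar-config) could look like this:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-agent
spec:
  containers:
  - name: app
    image: busybox
    # the application writes its logs to a shared volume
    args: [/bin/sh, -c, 'while true; do echo "$(date) app log line" >> /var/log/app/app.log; sleep 5; done']
    volumeMounts:
    - name: applog
      mountPath: /var/log/app
  - name: log-agent
    image: fluent/fluent-bit:1.8       # agent sidecar; ships logs straight to the backend
    volumeMounts:
    - name: applog
      mountPath: /var/log/app
    - name: agent-config               # assumed ConfigMap with the Fluent Bit configuration
      mountPath: /fluent-bit/etc/
  volumes:
  - name: applog
    emptyDir: {}
  - name: agent-config
    configMap:
      name: fluent-bit-sidecar-config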

Kubernetes Logging Tools

The most commonly used open-source logging stack for Kubernetes is EFK (Elasticsearch, Fluentd/Fluent Bit, and Kibana).

  1. Elasticsearch — Log aggregator
  2. Fluentd/Fluent Bit — Logging agent (Fluent Bit is the lightweight agent designed for container workloads)
  3. Kibana — Log visualization and dashboarding tool

Managed Kubernetes services like Google GKE, AWS EKS, and Azure AKS come integrated with the cloud provider's centralized logging. So when you deploy a managed Kubernetes cluster, you get the option to enable log monitoring in the respective logging service. For example,

  1. AWS EKS uses CloudWatch
  2. Google GKE uses Stackdriver monitoring
  3. Azure AKS uses Azure Monitor

Also, organizations might use enterprise logging solutions like Splunk. In this case, the logs get forwarded to Splunk for monitoring and to comply with the organization's log retention requirements.

I know of a use case where we pushed all the logs from GKE to Splunk using Pub/Sub for longer retention, and used Stackdriver with a shorter log retention for real-time querying due to cost constraints.

EFK For Kubernetes Logging

One of the best open-source logging setups for Kubernetes is the EFK stack. It contains Elasticsearch, Fluentd, and Kibana.

We have a detailed blog on EFK setup on Kubernetes. It covers the step-by-step guide to set up the whole stack.

Check out the EFK Kubernetes setup guide.

Kubernetes Logging FAQs

Let’s look at some of the frequently asked Kubernetes logging questions.

Where are Kubernetes pod logs stored?

If a pod uses the STDOUT and STDERR streams, the logs get stored in two locations: /var/log/containers and /var/log/pods/. However, if the pod uses the sidecar logging agent pattern, the logs stay within the pod and go directly to the logging backend.

How do I find Kubernetes audit logs?

You can find the audit logs at the path specified in the --audit-log-path flag during the cluster setup. If an audit policy is enabled and this flag is not set, the api-server streams the audit logs to STDOUT. In managed Kubernetes services, you get the option to enable audit logs, and you can check them in the respective managed logging backend, such as CloudWatch or Stackdriver.
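For example, on a self-managed cluster the api-server is typically started with flags along these lines (the paths and retention values are illustrative), together with a minimal audit policy file:

kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit/audit.log \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100

# /etc/kubernetes/audit-policy.yaml — log request metadata for all API activity
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata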

How does Kubernetes rotate logs?

Kubernetes itself is not responsible for rotating container logs. The underlying container runtime handles log rotation using specific parameters. For example, the Docker runtime has a configuration file, daemon.json, where you can configure log rotation options.
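For instance, /etc/docker/daemon.json can cap the size and number of JSON log files per container (the values here are just an example):

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5"
  }
}

On clusters that use a CRI runtime such as containerd, the kubelet's containerLogMaxSize and containerLogMaxFiles configuration fields play the equivalent role.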

How do I view Kubernetes logs in real-time?

You can use the kubectl utility to view logs and events in real time.
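For example:

kubectl logs -f <pod-name>                   # follow a pod's log stream
kubectl logs -f <pod-name> -c <container>    # follow a specific container in a multi-container pod
kubectl get events --watch                   # stream cluster events as they happen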

Conclusion

In this guide, we have looked at the essential concepts related to Kubernetes logging.

Also, in my DevOps engineer guide, I have explained why learning about logging is very important for a DevOps engineer.

In the next series of articles, we will look at setting up a Kubernetes logging infrastructure using open source tools.

Originally published at https://devopscube.com.
