Application Logs from Kubernetes to S3 and Elasticsearch using Fluentd

Problem Statement :

Application logging is an important part of the software development lifecycle. Deploying a log management solution in Kubernetes is simple when logs are written to stdout (the best practice). However, when a monolithic stack is containerised and pushed to Kubernetes, there are often many traditional log paths from which logs still need to be collected and aggregated.

I have tried many Helm charts for logging in Kubernetes. There are plenty of articles and repos for collecting logs from the Kubernetes stack itself, but there is no easy way to pull logs from arbitrary paths inside a container.

Scenario

Multiple microservices are running in a Kubernetes cluster deployed in AWS. Application logs are written to different paths inside the Docker containers (not stdout). These logs are needed in S3 for log analytics and long-term retention, and in Elasticsearch for real-time log aggregation and visualisation.

Solution :

Easy to deploy with customisations. No change needed in application logging. An out-of-the-box log toolset deployed alongside the application containers, but isolated and not impacting the performance of the app workload. Globally defined log functions that can be plugged into any new application being deployed. A secured aggregation pipeline.

Stack Used :

Kubernetes sidecar containers, Fluentd, AWS Elasticsearch, S3 and, obviously, Docker

Architecture :

Source code and reference deployment manifests can be found in the kubernetes-fluentd-logging repository on GitHub.

If you know what you are doing after looking at the repo, don’t scroll down; the rest of this post just walks newcomers through deploying the solution step by step.

Implementation :

I’m going to assume that you have a Kubernetes cluster at hand for this walkthrough.

Note : The good news is that there are no PVs, PVCs or NFS volumes in this architecture, and no StatefulSets. We read logs on the fly and push the indexed data outside the Kubernetes cluster.

➜  Documents git clone  https://github.com/prithviraju/kubernetes-fluentd-logging.git
Cloning into 'kubernetes-fluentd-logging'...
remote: Counting objects: 14, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 14 (delta 0), reused 11 (delta 0), pack-reused 0
Unpacking objects: 100% (14/14), done.
➜  kubernetes-fluentd-logging git:(master) tree
.
├── Dockerfile
├── LICENSE
├── README.md
├── fluent.conf
├── kubernetes.conf
├── sample_deployment.yaml
└── systemd.conf

There are no big changes needed in the configuration files. You should look at the fluent.conf file to add your environment-specific variables.

Update fluent.conf by replacing the items marked with ###

@include systemd.conf
@include kubernetes.conf

## App 1 config
<source>
  @type tail
  @id XXXX ### Uniq ID
  path XXXX ### Log path
  pos_file XXXX ### pos file
  tag XXXXX ### App tag
  read_from_head true
  <parse>
    @type regexp
    keep_time_key true
    expression (?<logtime>.*?) - (?<names>[^ ]*) - (?<log_level>[^ ]*) - (?<message>.*)
    time_format %Y/%m/%d %H:%M:%S,%6N%z
  </parse>
</source>

## App 2 config
<source>
  @type tail
  @id XXXXX ## Uniq ID
  path XXXX ### Log path
  pos_file XXXX ### pos file
  tag XXXXX ## App tag
  read_from_head true
  <parse>
    @type regexp
    keep_time_key true
    expression (?<logtime>.*?) - (?<names>[^ ]*) - (?<log_level>[^ ]*) - (?<message>.*)
    time_format %Y/%m/%d %H:%M:%S,%6N%z
  </parse>
</source>

# Store Data in Elasticsearch and S3
<match **>
  @type copy
  deep_copy true
  <store>
    @type s3
    aws_key_id XXXXXXXXXXXXXXXXX # Access Key
    aws_sec_key XXXXXXXXXXXXXXXXXXXXXX # Secret Key
    s3_bucket XXXXXX # S3 Bucket
    s3_region XXXXX # Bucket Region
    path "#{ENV['S3_LOGS_BUCKET_PREFIX']}"
    buffer_path /var/log/fluent/s3
    s3_object_key_format %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
    time_slice_format %Y%m%d-%H
    time_slice_wait 10m
    flush_interval 60s
    buffer_chunk_limit 256m
  </store>

  <store>
    @type elasticsearch
    @id out_es
    log_level info
    include_tag_key true
    host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
    port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
    scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'https'}"
    ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
    reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'true'}"
    logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'applogs'}"
    logstash_format true
    <buffer>
      flush_thread_count 8
      flush_interval 5s
      chunk_limit_size 2M
      queue_limit_length 32
      retry_max_interval 30
      retry_forever true
    </buffer>
  </store>
</match>

Take a moment to understand this configuration file: the two tail sources follow the application log files and parse each line with a regular expression, and the copy output duplicates every event into two stores, one shipping to S3 and one to Elasticsearch.
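As a purely hypothetical illustration (your application’s log format will differ, and the expression must be adapted to it), a line like the one below would match the regexp and time_format used in the sources above:

2018/06/13 19:02:11,123456+0000 - app1.server - INFO - GET /health-check 200 12ms

# Parsed into:
#   logtime   = 2018/06/13 19:02:11,123456+0000   (matches %Y/%m/%d %H:%M:%S,%6N%z)
#   names     = app1.server
#   log_level = INFO
#   message   = GET /health-check 200 12ms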

We are ready to build our image and use it in our Kubernetes deployment manifest.
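The image is just Fluentd plus the output plugins and these config files. The sketch below is only my approximation of such a Dockerfile; the base image tag and plugin list are assumptions, and the repo’s actual Dockerfile is authoritative (it may pull in extra plugins, for example fluent-plugin-kubernetes_metadata_filter for the included kubernetes.conf).

# Hypothetical sketch, not the repo's actual Dockerfile
# Base image tag is an assumption; use whatever the repo's Dockerfile specifies
FROM fluent/fluentd:v1.2-debian

USER root
# S3 and Elasticsearch output plugins used by fluent.conf
RUN gem install fluent-plugin-s3 fluent-plugin-elasticsearch --no-document

# Ship the configuration into the image
COPY fluent.conf /fluentd/etc/fluent.conf
COPY kubernetes.conf /fluentd/etc/kubernetes.conf
COPY systemd.conf /fluentd/etc/systemd.conf

USER fluent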

I will be using ECR to store my images.
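Assuming the AWS CLI is configured on your machine, authenticate Docker to ECR before pushing (AWS CLI v2 syntax shown; the account ID and region below are the same placeholders used in the build command):

aws ecr get-login-password --region ap-south-1 \
  | docker login --username AWS --password-stdin 123456789.dkr.ecr.ap-south-1.amazonaws.com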

➜  kubernetes-fluentd-logging git:(master) docker build  -t 123456789.dkr.ecr.ap-south-1.amazonaws.com/app-prod:logging-v1 .
➜  kubernetes-fluentd-logging git:(master) docker push 123456789.dkr.ecr.ap-south-1.amazonaws.com/app-prod:logging-v1

Updating your Deployment file

Look at sample_deployment.yaml, which is an extract from one of my production deployments.

In this example I’m interested in shipping two log files from my container: /var/log/access/app.log and /var/log/error/error.log.

Take a minute to understand the file below.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
  labels:
    app: app1
  name: app1
  namespace: app1
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: app1
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              topologyKey: kubernetes.io/hostname
            weight: 100
      containers:
      # Application container
      - name: app1
        image: "IMAGE:VERSION"
        env:
        - name: NODE_APP
          value: ""
        imagePullPolicy: IfNotPresent
        livenessProbe:
          httpGet:
            path: /health-check
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 15
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /health-check
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: applog-access
          mountPath: /var/log/access
        - name: applog-error
          mountPath: /var/log/error
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      # Fluentd sidecar container
      - name: fluentd
        image: IMAGE:VERSION_LOG
        env:
        - name: INSTANCE
          value: "app1"
        - name: FLUENT_ELASTICSEARCH_HOST
          value: ""
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "443"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "https"
        - name: S3_LOGS_BUCKET_PREFIX
          value: "app-logs/app1/"
        - name: FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX
          value: "app1"
        - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
          value: "true"
        - name: FLUENT_ELASTICSEARCH_USER
          value: ""
        - name: FLUENT_ELASTICSEARCH_PASSWORD
          value: ""
        - name: FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS
          value: "false"
        imagePullPolicy: Always
        resources:
          limits:
            memory: 100Mi
          requests:
            cpu: 50m
            memory: 100Mi
        volumeMounts:
        - name: applog-access
          mountPath: /log/access
        - name: applog-error
          mountPath: /log/error
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: applog-access
        emptyDir: {}
      - name: applog-error
        emptyDir: {}

Here /var/log/access from my app container is /log/access on my sidecar, and /var/log/error is /log/error.

In brief, we have two volumes, applog-access and applog-error, mounted in both containers.

The volumes attached to this pod are emptyDir volumes, which means they only live for as long as the pod is running.

Make sure the path entries in your fluent.conf point at these paths on the sidecar.
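For the two example files, the filled-in sources could look like the sketch below. Only the paths are dictated by the sidecar mounts; the @id, pos_file and tag values are placeholders I picked for illustration.

<source>
  @type tail
  @id app1_access ### illustrative id
  path /log/access/app.log
  pos_file /log/access/app.log.pos
  tag app1.access
  read_from_head true
  # <parse> block identical to the one shown earlier
</source>

<source>
  @type tail
  @id app1_error ### illustrative id
  path /log/error/error.log
  pos_file /log/error/error.log.pos
  tag app1.error
  read_from_head true
  # <parse> block identical to the one shown earlier
</source>

With the image references and Elasticsearch values in the manifest filled in, apply it to the cluster (the app1 namespace from the manifest must already exist):

kubectl apply -f sample_deployment.yaml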

You will not get this right on the first deployment, at least I did not! The kubectl commands below can be used to tail logs from the pod.

kubectl get pods -n $NAMESPACE
kubectl logs $POD_ID -n $NAMESPACE -c app1 -f      # application container
kubectl logs $POD_ID -n $NAMESPACE -c fluentd -f   # fluentd sidecar

If there are no errors, you should see logs arriving in S3 and Elasticsearch.

S3 directory structure:

S3-BUCKET
├── APP1
│   ├── 20180613-19
│   │   ├── cluster-log-0.gz
│   │   └── cluster-log-1.gz
│   └── 20180613-20
└── APP2
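You can also confirm the objects from the command line, assuming the AWS CLI is configured with credentials that can read the bucket. The bucket name is a placeholder here, and the prefix is whatever you set in S3_LOGS_BUCKET_PREFIX (app-logs/app1/ in the sample manifest):

aws s3 ls s3://YOUR_S3_BUCKET/YOUR_PREFIX/ --recursive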

You can access Kibana for real-time log visualisation.
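Before creating a Kibana index pattern, it can help to confirm the indices are being created. With logstash_format enabled, the Elasticsearch output writes daily indices named after the logstash_prefix (app1-YYYY.MM.DD in this example), so a quick check against the cat API works, assuming the domain’s access policy allows requests from your machine:

curl -s "https://$FLUENT_ELASTICSEARCH_HOST/_cat/indices/app1-*?v"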

Happy Logging !
