Kubernetes logs to AWS Cloudwatch with fluentd
EKS has just been released in eu-west-1 (Ireland), but while Kubernetes is a mature project, there are still some pieces missing from EKS that you might expect, especially if you're migrating from ECS, Amazon's own, earlier competitor to k8s.
As Kubernetes is container agnostic and adheres to the Open Container Initiative, there are some docker-specific options that you might have relied on previously when running containers in AWS. Notably the awslogs docker log driver, which can be passed with --log-driver awslogs and ships the container logs to cloudwatch for you.
Kubernetes doesn't expose the --log-driver option, as it's a flag specific to the docker container runtime. So to replicate what it did for you, we need a little DIY.
When running containers on ECS, awslogs organises log messages into "log groups" and "log streams", and lets you specify the name of the group and the stream prefix for a container:
--log-driver awslogs
--log-opt awslogs-group=user-service
--log-opt awslogs-stream-prefix=user-service
Using these options when starting a docker container, the logs in cloudwatch would be organised as:
user-service (log group)
├── user-service-a4782574 (log stream)
├── …
├── user-service-f59185d5
└── user-service-1f918b24
This gives you an overview of all logs in the group, and allows you to drill down into each log stream to look at the logs for a single container. Translating this into k8s language, we have:
user-service (container name)
├── user-service-a4782574 (pod name)
├── …
├── user-service-f59185d5
└── user-service-1f918b24
There are a number of different logging architectures suggested by the k8s docs, but the one we describe below requires no additional setup once it's first deployed.
We will run fluentd as a daemonset that automatically creates the required log groups and streams, with a format mirroring what you could achieve on ECS using the docker logging options.
Aside on fluentd
Fluentd is a unified logging layer that can collect, process and forward logs. It's got hundreds of plugins and is configured using simple syntax that revolves around 3 stages:
1. source: where are the log messages coming from
2. filter: how should we transform the messages
3. match: where should we send the logs
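To make the shape of a config concrete, here is a minimal, hypothetical skeleton showing the three stages together. The paths, tags and values are placeholders, not the config we actually use (which follows below):
# A minimal, illustrative fluentd config: one source, one filter, one match
<source>
  @type tail                            # ingest: tail a log file
  path /var/log/example/*.log           # placeholder path
  pos_file /var/log/example.log.pos
  tag example.*
  <parse>
    @type none                          # treat each line as plain text
  </parse>
</source>

<filter example.**>
  @type record_transformer              # transform: add a field to every record
  <record>
    environment staging                 # placeholder static field
  </record>
</filter>

<match example.**>
  @type stdout                          # output: print records instead of shipping them
</match>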
Fluentd config
Source:
K8s uses the json-file logging driver for docker, which writes logs to a file on the host. K8s symlinks these logs to a single location, irrespective of container runtime, and we will ingest the logs from this symlinked location.
We register the container log path as an input source with fluentd:
<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>
We use the tail input source to tail the log files; this assumes that there's one log message per line.
Each line of the file is turned into a single fluentd record, and each record is given a tag by fluentd. This tag can be used to match only certain records later in the fluentd pipeline.
For records coming from docker logs, the original tag will be something like var.log.containers.${log_file_name}.log. Using the tag kubernetes.* option prefixes the existing tag, resulting in kubernetes.var.log.containers.${log_file_name}.log.
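As a concrete (made-up) example, assuming the usual k8s naming convention for files under /var/log/containers, a log file such as:
/var/log/containers/user-service-0d25baf_default_user-service-4667aa8e36ff.log
would produce records tagged:
kubernetes.var.log.containers.user-service-0d25baf_default_user-service-4667aa8e36ff.log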
Filter:
The ingested log message coming from docker will look like:
{
  "log": "2014/09/25 21:15:03 I am sample log message\n",
  "stream": "stderr",
  "time": "2018-09-12T10:25:41.499185026Z"
}
We use the kubernetes_metadata filter to decorate the fluentd record with k8s data.
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
After this filter is applied our fluentd record looks more like:
{
  "log": "2014/09/25 21:15:03 I am sample log message\n",
  "stream": "stderr",
  "time": "2018-09-12T10:25:41.499185026Z",
  "kubernetes": {
    "namespace_name": "default",
    "pod_name": "user-service-0d25baf",
    "container_name": "user-service"
  },
  "docker": {
    "container_id": "4667aa8e36ff8c9ed9d28a227e93a6718eb6498ead5807d4c4d515338041baee"
  }
}
We now have the k8s container and pod name inside the record and can use this data to ship logs to the correct cloudwatch group / stream. But first we apply 2 more filters to the record:
<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    service ${ENV["ENV"]}-${record["kubernetes"]["container_name"]}
    pod ${record["kubernetes"]["pod_name"]}
  </record>
</filter>

<filter kubernetes.**>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>
The first filter, record_transformer, allows changing the content of the record; we use it to create two new top-level fields (that we will use later): service and pod.
We set enable_ruby true to ensure that we can access both the ENV that fluentd is running in and the nested record content; ${record["foo"]["bar"]} is not possible without enabling ruby. (The ENV environment variable referenced here has to be set on the fluentd daemonset containers.)
The second filter applied is a parser; this allows changing the format of the content. We apply the json parser to all the records, because our services log json, which ends up nested inside the docker json. It converts a record that looks like:
{
  "log": "{\"msg\":\"hello world\",\"trace_id\":\"9b35a27a6becc962\",\"field\":1}",
  "stream": "stderr",
  "time": "2018-09-12T10:25:41.499185026Z"
}
… and extracts the nested json from the log field, creating a record that looks like:
{
  "stream": "stderr",
  "time": "2018-09-12T10:25:41.499185026Z",
  "msg": "hello world",
  "trace_id": "9b35a27a6becc962",
  "field": 1
}
Note: we set emit_invalid_record_to_error false because not all of the services log json; this ensures that no errors are emitted if the json parsing fails.
Output:
Finally we need to ship the logs to cloudwatch. We have all the information that we need in the record, and use the fluent-plugin-cloudwatch-logs plugin to do the shipping:
<match kubernetes.**>
  @type cloudwatch_logs
  log_group_name_key service
  log_stream_name_key pod
  remove_log_group_name_key true
  remove_log_stream_name_key true
  auto_create_stream true
</match>
We use the service and pod record keys that we created in the filter steps, and remove them from the final record.
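Putting it together: assuming, for illustration, that the ENV environment variable on the fluentd daemonset is set to production, the resulting cloudwatch layout mirrors the ECS one from earlier:
production-user-service (log group)
├── user-service-a4782574 (log stream)
├── …
├── user-service-f59185d5
└── user-service-1f918b24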
We also need to make sure that fluentd has the correct IAM permissions for creating log groups and streams, so that the auto_create_stream true option works.
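As a rough sketch of those permissions (tighten the Resource to your own log groups as appropriate), the IAM policy attached to the fluentd nodes or pods needs something along these lines:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}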
Wrap up
This allows us to use the same logging infrastructure for all of our applications, whether they are running on EC2 instances, on ECS or in EKS. And none of the services / pods deployed needs to concern itself with shipping logs; it's automagically supported.
The k8s resources are beyond the scope of this post, but combining the config here with the daemonsets available at the links below will get you 95% of the way. Remember the IAM permissions.
GitHub — fluent/fluentd-kubernetes-daemonset: Fluentd daemonset for Kubernetes and its Docker image
GitHub — fluent-plugins-nursery/fluent-plugin-cloudwatch-logs: CloudWatch Logs Plugin for Fluentd