Deploying the OpenTelemetry Collector on Kubernetes
The OpenTelemetry Collector is a binary that is typically deployed as an agent on hosts that run business applications, but more and more applications now run on container platforms like Kubernetes.
But what does it take to run the OpenTelemetry Collector on Kubernetes?
The OpenTelemetry Collector project ships with an example file to be fed into Kubernetes, but that example file is intended as a guide, and not as a production-ready solution. The collector is so versatile that a one-size-fits-all deployment configuration is pretty much impossible to achieve.
Deployment modes
A bare metal deployment of the collector is simple to plan and execute: it’s a single binary that runs as a daemon on the host. For Kubernetes, however, we have a few options to pick from:
- Deployment, where multiple replicas can coexist, possibly on the same node
- DaemonSet, where one instance exists for each Kubernetes node
- StatefulSet, where an exact number of replicas should exist at all times, each with a predictable name (collector-1, collector-2, …)
- Sidecar, where one instance exists as a container alongside each pod running your business application(s), playing the role of an agent
Most commonly, you’d use a mix of regular Deployments and Sidecars: deployments are highly elastic, potentially scaling up and down automatically via the Horizontal Pod Autoscaler, while sidecars allow your application to offload the telemetry data to a process running on the same “host”. Sidecars are more likely to have custom configuration (e.g. processors) specific to the business application, whereas deployments will generally have a more generic configuration.
DaemonSets are most commonly used with single-tenant, multi-cloud deployments, where there is a need to ensure that telemetry data from the applications is pre-processed by a process on the same node before hitting the public internet.
StatefulSets should be used when the number of replicas for the collector instance isn’t going to change frequently and you are making use of a processor that would benefit from a stable list of hostnames, such as the load balancing exporter.
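For instance, a collector sitting in front of such a StatefulSet could use the load balancing exporter with a static list of those predictable hostnames. The snippet below is only a sketch of that idea; the StatefulSet name (collector), the headless Service name (collector-headless), and the OTLP port are assumptions for illustration:

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true   # assumes plain gRPC inside the cluster
    resolver:
      static:
        hostnames:         # stable names provided by the StatefulSet + headless Service
        - collector-0.collector-headless:4317
        - collector-1.collector-headless:4317
        - collector-2.collector-headless:4317
```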
Inventory
Once you have designed the topology for your deployment, it’s time to start an inventory! For this example, we’ll use the classic Deployment+Sidecar mode.
No matter the deployment mode, you’ll probably need a configuration file for your collector. And as this has to be an actual file, we’ll create a ConfigMap for it, like this:
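The exact contents depend on your pipeline; the ConfigMap below is just a minimal sketch, assuming an OTLP gRPC receiver on its default port (4317), the logging exporter for the main collector, and a second key (agent.yaml) that the sidecars will use later to forward data to the collector’s Service. The names used here (collector-config, opentelemetrycollector) are choices for this example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: collector-config
data:
  collector.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
  agent.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      otlp:
        # the Service we'll create later in this post; add the namespace
        # suffix (.<ns>.svc.cluster.local) if the sidecars live elsewhere
        endpoint: "opentelemetrycollector:4317"
        tls:
          insecure: true   # older collector versions used a top-level `insecure: true`
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]
```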
Following that, we can create our Deployment. The simplest solution that works is the following:
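Here’s a sketch of such a Deployment, assuming the ConfigMap above; the image tag and mount path are arbitrary choices for this example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: opentelemetrycollector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opentelemetrycollector
  template:
    metadata:
      labels:
        app: opentelemetrycollector
    spec:
      containers:
      - name: opentelemetrycollector
        image: otel/opentelemetry-collector:latest   # pin a specific version in production
        args: ["--config=/conf/collector.yaml"]
        volumeMounts:
        - name: collector-config
          mountPath: /conf
      volumes:
      - name: collector-config
        configMap:
          name: collector-config
          items:
          - key: collector.yaml
            path: collector.yaml
```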
We now have a collector instance up and running, but it still can’t receive data: for that, we need a Service that exposes the OTLP port that we declared in our configuration. The following is a service definition that satisfies the requirement:
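This sketch assumes the OTLP gRPC port (4317) from the ConfigMap above and the app: opentelemetrycollector label from the Deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: opentelemetrycollector
spec:
  selector:
    app: opentelemetrycollector
  ports:
  - name: otlp-grpc      # matches the OTLP gRPC receiver in collector.yaml
    port: 4317
    targetPort: 4317
    protocol: TCP
```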
Testing it
We now have all the pieces in place for our collector; it’s time to start an application that generates traces and sends them to our collector. For this, we’ll create a deployment that contains an application that creates a trace per HTTP request that it receives. This application is packaged in a container image named quay.io/jpkroehling/generate-span-java:0.1.0, and we can use a deployment like the following:
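The manifest below is a sketch: the sidecar container, its config mount, and the assumption that the application exports OTLP spans to localhost are illustrative choices for this example, not taken from the image’s documentation:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: quay.io/jpkroehling/generate-span-java:0.1.0
        ports:
        - containerPort: 8080
      # sidecar agent: receives OTLP from the app on localhost and forwards
      # it to the central collector via the Service created earlier
      - name: agent
        image: otel/opentelemetry-collector:latest
        args: ["--config=/conf/agent.yaml"]
        volumeMounts:
        - name: agent-config
          mountPath: /conf
      volumes:
      - name: agent-config
        configMap:
          name: collector-config
          items:
          - key: agent.yaml
            path: agent.yaml
```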
Once we have the deployment ready, we can make a call to its /orders endpoint and it will generate a few spans for us. The easiest way to make this call for testing is by doing a port-forward, so that calls to localhost:8080/orders will land at our application in the Kubernetes cluster:
$ kubectl port-forward deployment/myapp 8080
Forwarding from 127.0.0.1:8080 -> 8080
Let’s now tail the logs for our collector: once we make a call to our service, the logging exporter in the collector will make sure to record this event in the logs.
$ kubectl logs deployments/opentelemetrycollector -f
...
2021-01-22T12:59:53.561Z info service/service.go:267 Everything is ready. Begin running and processing data.
And finally, let’s generate some spans:
$ curl localhost:8080/orders
Created
In the collector logs, we should now see something like this:
2021-01-22T13:26:49.320Z INFO loggingexporter/logging_exporter.go:313 TracesExporter {"#spans": 2}
2021-01-22T13:26:49.430Z INFO loggingexporter/logging_exporter.go:313 TracesExporter {"#spans": 1}
2021-01-22T13:26:50.422Z INFO loggingexporter/logging_exporter.go:313 TracesExporter {"#spans": 4}
2021-01-22T13:26:50.571Z INFO loggingexporter/logging_exporter.go:313 TracesExporter {"#spans": 1}
This confirms that the spans generated by our application have gone to the collector via the sidecar agent. In a real setup, we’d configure our collector to export our spans to a real backend, such as Jaeger or Zipkin.
Alternatives
By now, you’ve probably realized that deploying a simple instance of the OpenTelemetry Collector on Kubernetes isn’t that hard. But what about day-2 concerns, like version upgrades? Or how about keeping the Service in sync with the ConfigMap, so that all ports defined in the configuration are automatically exposed via the Service? And wouldn’t it be nice to automatically inject sidecars into your business deployments? Those are tasks that the OpenTelemetry Operator can take care of.
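With the Operator installed, a collector instance is described by a custom resource instead of the Deployment/ConfigMap/Service trio we built by hand. The sketch below follows the Operator’s own minimal example; the API version and field names may differ between Operator releases:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```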
If you are more into Helm charts, there’s also an experimental one available to provision the collector.
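The chart lives in the open-telemetry/opentelemetry-helm-charts repository; something along these lines should get you a collector, although the chart’s values and defaults may have changed since:

$ helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
$ helm install my-collector open-telemetry/opentelemetry-collector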
Wrapping up
The OpenTelemetry Collector is a very flexible and lightweight process, making it possible to mix and match strategies and build a chain of collectors based on your specific needs. Deploying individual instances of the collector on Kubernetes isn’t a complex task, although maintaining those instances might make you consider tools like the Helm chart or the Operator.