OpenTelemetry in Kubernetes: Deploying your Collector and Metrics Backend

Tathagata Paul
3 min read · Mar 31, 2022

OpenTelemetry is a great way to instrument your applications so that they provide metrics to any observability backend in a vendor-agnostic way. But a lot of people run into issues when deploying it on Kubernetes. I knew how Kubernetes works, but I still had trouble deploying the collector and, at times, instrumenting my application. The resources on the internet are scattered, and going through them takes a lot of time. Few of them show a concrete implementation of OpenTelemetry in Kubernetes from start to finish (or the ones that do are very cleverly hidden).

So I decided to write this blog to demonstrate a very simple implementation: deploying a collector that collects metrics and then exports the data to an observability backend (Prometheus, in this post). In another blog, I will show how an application written in Go can be instrumented to expose metrics.

Deploying the Prometheus Backend and OpenTelemetry Collector:

Our objective, for now, will be to deploy a single OpenTelemetry Collector instance and export telemetry data from the instance to our Prometheus backend (we will worry about traces later…). Let’s start with the Prometheus instance first…

Prometheus:

  1. Create a namespace:
kubectl create namespace sample

2. Create a cluster role for Prometheus and bind it to a service account in the “sample” namespace:
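
Here is a minimal sketch of what such a manifest could look like; it is not necessarily the exact one used for this post. The resource name prometheus, the list of rules, and the choice to bind the role to the default service account of the “sample” namespace are all assumptions.

```yaml
# Assumed names: "prometheus" for the role and binding, "default" service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  # Read-only access to the objects Prometheus usually discovers and scrapes.
  - apiGroups: [""]
    resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  # Grants the role to the pods running under this service account.
  - kind: ServiceAccount
    name: default
    namespace: sample
```

Saved as, say, prometheus-rbac.yaml (a hypothetical file name), this can be applied with kubectl apply -f prometheus-rbac.yaml. With a purely static scrape target these permissions are not strictly required; they start to matter once Prometheus uses Kubernetes service discovery.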

3. Config map for defining the Prometheus configs:
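
A minimal sketch of such a config map, assuming the name prometheus-server-conf, a job name of opentelemetry-collector, and a 15s scrape interval (none of these come from the original manifest). The scrape target is the collector service address used throughout this post.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf   # assumed name
  namespace: sample
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s       # assumed interval
    scrape_configs:
      - job_name: opentelemetry-collector   # assumed job name
        static_configs:
          # The collector service, on the port of its prometheus exporter.
          - targets: ['opentelemetrycollector.sample.svc.cluster.local:9090']
```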

4. Deploy the Prometheus pod:
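
The deployment could look roughly like the sketch below. The app: prometheus label, the unpinned prom/prometheus image, the emptyDir storage, and the config map name prometheus-server-conf are assumptions made for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus            # assumed label
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus   # consider pinning a specific tag
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus/
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus/
            - name: storage
              mountPath: /prometheus/
      volumes:
        - name: config
          configMap:
            name: prometheus-server-conf   # assumed config map name
        - name: storage
          emptyDir: {}   # data is lost when the pod is rescheduled
```

An emptyDir volume keeps the example short; for anything beyond a demo you would probably want a persistent volume instead.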

5. And finally, the service for the deployment (so that we can later see the Prometheus data in our browser):
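
A minimal sketch of the service, using the name prometheus-service that the port-forward command in this post refers to. The selector app: prometheus and the port name web are assumptions and must match whatever labels your Prometheus pods carry.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: sample
spec:
  selector:
    app: prometheus    # assumed pod label
  ports:
    - name: web        # assumed port name
      port: 9090
      targetPort: 9090
```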

One thing to notice while deploying our Prometheus pod and service:

  • The scrape target in the Prometheus config is ['opentelemetrycollector.sample.svc.cluster.local:9090']. This means that Prometheus will scrape the “opentelemetrycollector” service in our “sample” namespace on port 9090, which is the port on which the collector will expose its metrics.

Let’s create our collector now:

  1. First, the config map which will allow us to configure our OpenTelemetry Collector:
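
A sketch of a collector config map along these lines, with an OTLP receiver, a batch processor, and a Prometheus exporter on port 9090. The config map name otel-collector-conf and the key collector.yaml are assumptions. The logging exporter with loglevel: debug matches collector releases from around the time of writing; newer releases replace it with a debug exporter, so adjust that part to the version you run.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf   # assumed name
  namespace: sample
data:
  collector.yaml: |           # assumed key / file name
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    processors:
      batch:
        timeout: 5s           # send a batch at least every 5 seconds
    exporters:
      prometheus:
        endpoint: 0.0.0.0:9090   # Prometheus scrapes this port
      logging:
        loglevel: debug       # older releases only; newer ones use the "debug" exporter
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheus, logging]
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]
```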

Here we can notice a few things:

  • The receivers section defines how we want to receive our telemetry. There are lots of receivers. I decided to use OTLP (OpenTelemetry Protocol) as it allows collection of both metrics and traces.
  • The processors act on the data as it passes through the collector and can be used to change or aggregate it in some way. This is out of scope for a simple deployment like ours. I have just added a batch processor, which sends batches of data every 5s, to demonstrate how a processor is configured.
  • The exporters can export the data in many formats. We are interested in the Prometheus format, so that our data can be collected by Prometheus. Here we expose the data on port 9090, the same port that Prometheus is configured to scrape. We also set the log level to debug so that any potential errors show up in the logs of our opentelemetrycollector pod.
  • The service section finally enables what was configured in the receivers, processors, and exporters sections. If a component is configured in the sections above but is not referenced in the service section, it is not enabled. The service section also defines separate pipelines for traces and metrics, each with its own receivers, processors, and exporters.

2. Okay, that was a lot of theory; let’s move on to the deployment:
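
The collector deployment could be sketched as follows. The image and its tag, the label app: opentelemetrycollector, the mount path /conf, and the config map name otel-collector-conf with key collector.yaml are all assumptions; choose an image version whose config syntax matches your collector config.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: opentelemetrycollector
  namespace: sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opentelemetrycollector   # assumed label
  template:
    metadata:
      labels:
        app: opentelemetrycollector
    spec:
      containers:
        - name: otelcol
          # Assumed image and tag, roughly matching the date of this post.
          image: otel/opentelemetry-collector-contrib:0.47.0
          args:
            - --config=/conf/collector.yaml
          ports:
            - name: otlp-grpc
              containerPort: 4317   # OTLP gRPC receiver
            - name: prom-metrics
              containerPort: 9090   # prometheus exporter
          volumeMounts:
            - name: collector-config
              mountPath: /conf
      volumes:
        - name: collector-config
          configMap:
            name: otel-collector-conf   # assumed config map name
            items:
              - key: collector.yaml
                path: collector.yaml
```

Passing --config explicitly avoids depending on the default config path baked into the image, which has changed between releases.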

3. And finally, the service which will allow our Prometheus pod to scrape the metrics:
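
A minimal sketch of this service. The name opentelemetrycollector matches the scrape target in the Prometheus config; the selector label and the name of the metrics port are assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: opentelemetrycollector
  namespace: sample
spec:
  selector:
    app: opentelemetrycollector   # assumed pod label
  ports:
    - name: otlp-grpc             # applications send OTLP over gRPC here
      port: 4317
      targetPort: 4317
    - name: prom-metrics          # assumed name; Prometheus scrapes this port
      port: 9090
      targetPort: 9090
```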

Notice the otlp-grpc port, 4317. The OTLP receiver listens for metrics on this port, and our application will connect to it over gRPC.

Port-forward the prometheus-service to view the metrics on localhost:9090 (once you actually send some metrics through the collector :) )

kubectl port-forward svc/prometheus-service 9090:9090 -n sample

This shows us how to deploy a simple OpenTelemetry Collector instance on our Kubernetes cluster along with a Prometheus backend that scrapes it. But you might be wondering what happens if you need to change some collector configuration. That will require us to redeploy the Collector for the ConfigMap and the Deployment to be in sync… right? And what happens if we have multiple nodes? That’s why there is an OpenTelemetry Operator (https://github.com/open-telemetry/opentelemetry-operator), which allows us to easily deploy the collector in various modes, such as Sidecars, DaemonSets, StatefulSets, and combinations of them. This allows us to scale our deployments in larger projects.
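
To give a rough idea of what this looks like, here is a sketch of an OpenTelemetryCollector resource for the operator's v1alpha1 API, where the collector config is embedded as a string. It assumes the operator is already installed in the cluster; the resource name sample-otel is hypothetical, and newer operator versions also offer a v1beta1 API with a slightly different shape, so check the version you have installed.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sample-otel        # hypothetical name
  namespace: sample
spec:
  mode: deployment         # alternatives: daemonset, sidecar, statefulset
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    processors:
      batch:
        timeout: 5s
    exporters:
      prometheus:
        endpoint: 0.0.0.0:9090
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheus]
```

With this approach the operator creates and updates the underlying Deployment and Service for you, so a config change is a single edit to this resource. The generated service will typically have a different name than the hand-written one, so the Prometheus scrape target would need to be adjusted accordingly.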

A few resources which I found helpful:

  1. https://opentelemetry.io/docs/collector/configuration/#processors
  2. https://medium.com/opentelemetry/deploying-the-opentelemetry-collector-on-kubernetes-2256eca569c9
  3. https://signoz.io/blog/opentelemetry-kubernetes/
  4. https://www.youtube.com/watch?v=L_gjG4BjvSE&list=PLwM0dm4_8NSaoJCV0BykBH0ccj6HZIsJe&index=9
  5. https://www.youtube.com/watch?v=L-Ss8PtWlRA&list=PLwM0dm4_8NSaoJCV0BykBH0ccj6HZIsJe&index=7
