Jaeger embraces OpenTelemetry collector

Published in JaegerTracing, May 1, 2020

In this article we explain how Jaeger integrates with the OpenTelemetry collector, describe the differences between the two, and look at Kubernetes deployment via the Jaeger Operator. For the long-term integration roadmap, see the post "Jaeger and OpenTelemetry" by Yuri Shkuro.

The OpenTelemetry collector is a vendor-agnostic service for receiving, processing and exporting telemetry data. In the Jaeger project we have decided to deprecate the Jaeger collector and migrate its functionality to an implementation based on the OpenTelemetry collector. This has several benefits:

forward compatibility with OpenTelemetry native data model
tail-based sampling
attribute processors
standardized collection pipeline
less code to maintain

Roadmap

In the long term we would like to base the Jaeger collector, agent and ingester components on OpenTelemetry collector. These new components will be separate distributions with new image and binary names.

Our goal is to provide a smooth migration from existing binaries by supporting legacy Jaeger configuration (flags, env. vars, Jaeger configuration file). However, there will be a couple of breaking changes:

a different set of metrics exposed by these components
different health check and metric endpoints
not all flags will be supported in the new components, e.g. --metrics-backend, --collector.queue-size

Follow this milestone on GitHub to see what is missing for the first stable release.

In the meantime, you can use the collector image with the latest tag: jaegertracing/jaeger-opentelemetry-collector:latest.

Configuration & give it a try!

The upstream OpenTelemetry collector is configured via a configuration file that is passed as a flag at application startup: --config-file=config.yaml. If the configuration file is missing, the collector will not start, because it does not know which components (receivers, processors, exporters) it should use.
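For illustration, a minimal upstream configuration might look like the sketch below. It wires one receiver, one processor and one exporter into a traces pipeline. The component names used here (the jaeger receiver, the batch processor and the logging exporter) are assumptions chosen for the example, and the exact keys depend on the collector version, so treat it as a shape rather than a reference.

```yaml
# Illustrative sketch only: component names and keys are assumptions
# and may differ between OpenTelemetry collector versions.
receivers:
  jaeger:
    protocols:
      grpc:

processors:
  batch:

exporters:
  logging:

service:
  pipelines:
    traces:
      # Only components listed here are actually used by the pipeline.
      receivers: [jaeger]
      processors: [batch]
      exporters: [logging]
```

Every component that should run has to appear both in its top-level section and in the pipeline definition.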

Jaeger’s build of the OpenTelemetry collector is opinionated about the configuration and it always uses a set of default components: Jaeger receiver, processors, and exporter. The exporter is one of Jaeger’s supported storage backends: Elasticsearch, Cassandra, Kafka (buffer). The configuration provided in the file is merged with the default configuration.

Jaeger-specific components can be configured by the same flags that were exposed by the Jaeger collector, e.g. --es.server-urls. Values provided in the OpenTelemetry configuration file take precedence over these flags.

The configuration precedence from the lowest to the highest is as follows:

Jaeger default values < Jaeger config file < environment variables < flags (Viper’s default precedence order)
OpenTelemetry configuration file

Let’s have a look at an example that configures Jaeger OpenTelemetry collector:

Enables Elasticsearch backend with URL http://elasticsearch:9200, 3 primary shards (default is 5) and 2 replica shards (default 1)
Disables batch processor (enabled by default). It’s disabled because it is not specified in the service.pipelines.traces.processors array.
Enables attribute processor (disabled by default). Note that new components have to be explicitly added to the pipeline.

docker run --rm -it -v ${PWD}:/config \
    -e SPAN_STORAGE_TYPE=elasticsearch \
    jaegertracing/jaeger-opentelemetry-collector \
    --config-file=/config/config.yaml \
    --es.server-urls=http://localhost:9200 \
    --es.num-shards=3

The content of config.yaml:

exporters:
  jaeger_elasticsearch:
    es:
      server-urls: http://elasticsearch:9200
      num-replicas: 2
processors:
  attributes:
    actions:
      - key: user
        action: delete
service:
  pipelines:
    traces:
      processors: [attributes]
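To make the merge concrete, the following shell sketch walks the precedence order for the three Elasticsearch settings in this example. It is only an illustration of the ordering described above, not how the collector is implemented. The resolve function is a hypothetical helper, and an empty string stands for "not set at this level".

```shell
# Illustrative only: pick the value from the highest-precedence source that
# sets it. Arguments are ordered from lowest to highest precedence:
#   resolve DEFAULT JAEGER_CONFIG_FILE ENV_VAR FLAG OTEL_CONFIG_FILE
# "resolve" is a hypothetical helper, not part of Jaeger or OpenTelemetry.
resolve() {
    value=""
    for candidate in "$@"; do
        # A later (higher-precedence) source overrides earlier ones when set.
        if [ -n "$candidate" ]; then
            value="$candidate"
        fi
    done
    printf '%s\n' "$value"
}

# Values come from the docker command and config.yaml in this example.
# The default server URL is not stated in this article, so it is left empty.
server_urls=$(resolve "" "" "" "http://localhost:9200" "http://elasticsearch:9200")
num_shards=$(resolve "5" "" "" "3" "")
num_replicas=$(resolve "1" "" "" "" "2")

echo "server-urls:  $server_urls"
echo "num-shards:   $num_shards"
echo "num-replicas: $num_replicas"
```

The URL from config.yaml wins over the one passed as a flag, the shard count comes from the flag because the file does not set it, and the replica count comes from the file.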

The storage exporter can be selected via the same environment variable, SPAN_STORAGE_TYPE, as in the Jaeger collector, or it can be specified in service.pipelines.traces.exporters.
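The second option, naming the exporter in the pipeline, might look like the fragment below. It reuses the jaeger_elasticsearch exporter name from config.yaml above. It assumes that the exporters array is handled the same way as the processors array in the earlier example.

```yaml
# Sketch: select the Elasticsearch storage exporter in the pipeline
# instead of setting SPAN_STORAGE_TYPE=elasticsearch.
exporters:
  jaeger_elasticsearch:
    es:
      server-urls: http://elasticsearch:9200

service:
  pipelines:
    traces:
      exporters: [jaeger_elasticsearch]
```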

Jaeger Operator

Because the new components are almost drop-in replacements for the existing Jaeger binaries, we will be able to use them directly in the Jaeger Operator by explicitly providing the image name in the CR. The required change in the Jaeger Operator is to expose the OpenTelemetry configuration in the CR. At the moment this is just a design proposal that is being discussed in the issue jaeger-operator/issues/1004.

Following the same configuration as in the previous section, the OpenTelemetry configuration is embedded directly into the collector node of the Jaeger CR:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  collector:
    config: |
      exporters:
        jaeger_elasticsearch:
          es:
            server-urls: http://elasticsearch:9200
            num-replicas: 2
      processors:
        attributes:
          actions:
            - key: user
              action: delete
      service:
        pipelines:
          traces:
            processors: [attributes]
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://localhost:9200
        num-shards: 3

Conclusion

We have explained how the Jaeger project integrates with the OpenTelemetry collector and what the key differences between these components are. Try our new collector based on OpenTelemetry and share your feedback with us.

References

Jaeger OpenTelemetry collector source code: https://github.com/jaegertracing/jaeger/tree/master/cmd/opentelemetry-collector
Jaeger OpenTelemetry collector docker image: https://hub.docker.com/r/jaegertracing/jaeger-opentelemetry-collector
OpenTelemetry collector: https://github.com/open-telemetry/opentelemetry-collector