Using OpenTelemetry auto-instrumentation/agents in Kubernetes

Published in OpenTelemetry, Nov 22, 2021

In this article, I would like to introduce a new feature of the OpenTelemetry Operator that significantly simplifies instrumenting workloads deployed on Kubernetes.

Instrumentation is the most tedious part of deploying an observability solution. There are multiple approaches to instrumenting an application:

manual/explicit: the source code is instrumented explicitly, either by calling the OpenTelemetry API directly or by using pre-built instrumentation libraries that are linked at compile time.
automatic: the application is instrumented without any code modifications, and no recompilation of the application is needed.

For a long time, automatic instrumentation was available only as proprietary technology offered by various APM/observability vendors. OpenTelemetry changed this and made the technology available as open source. Users get vendor-neutral instrumentation and avoid vendor lock-in for the most crucial part of the observability integration.

However, deploying auto-instrumentation at scale or validating proof of value can still be tedious. Modern applications are packaged in immutable container images, so adding instrumentation to already built images requires rebuilding them. Rebuilding a large number of application images requires a major investment. Let’s have a look at how this problem can be solved on Kubernetes.

Instrumentation CR in OpenTelemetry operator

OpenTelemetry Operator 0.38.0 introduced the Instrumentation custom resource (CR), which defines the configuration for the OpenTelemetry SDK and instrumentation. Instrumentation is enabled when an Instrumentation CR is present in the cluster and a namespace or workload is annotated:

kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
EOF

At the moment, instrumentation is supported for Java, NodeJS, and Python. It is enabled when one of the following annotations is applied to a workload or a namespace:

instrumentation.opentelemetry.io/inject-java: "true" — for Java
instrumentation.opentelemetry.io/inject-nodejs: "true" — for NodeJS
instrumentation.opentelemetry.io/inject-python: "true" — for Python

After the annotation is applied, the operator injects the OpenTelemetry auto-instrumentation libraries into the application container and configures the instrumentation to export data to the endpoint defined in the Instrumentation CR.
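To illustrate the namespace-level variant, here is a minimal sketch of a Namespace manifest carrying the Java annotation. The namespace name demo is hypothetical, and the sketch assumes that an Instrumentation CR has been created in that same namespace. Pods created there after the annotation is in place would then be candidates for injection.

```yaml
# Hypothetical namespace; the name "demo" is an assumption for illustration.
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  annotations:
    # Requests Java auto-instrumentation for pods created in this namespace.
    instrumentation.opentelemetry.io/inject-java: "true"
```

Annotating an individual workload instead, as done in the next section, keeps the effect scoped to a single Deployment.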

Java example with Spring Petclinic

Now let’s deploy the Spring Petclinic Java application that will be instrumented and report data to an OpenTelemetry collector. The collector will log spans to the standard output.

Create the following deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-petclinic
spec:
  selector:
    matchLabels:
      app: spring-petclinic
  replicas: 1
  template:
    metadata:
      labels:
        app: spring-petclinic
    spec:
      containers:
      - name: app
        image: ghcr.io/pavolloffay/spring-petclinic:latest

and apply the instrumentation annotation:

kubectl patch deployment.apps/spring-petclinic -p '{"spec": {"template": {"metadata": {"annotations": {"instrumentation.opentelemetry.io/inject-java": "true"}}}}}'

After the annotation is applied, the spring-petclinic pod will restart, and the newly started pod will be instrumented with the OpenTelemetry Java auto-instrumentation. The data will be reported in OTLP format to the collector at http://otel-collector:4317. The following snippet deploys an OpenTelemetry collector that logs traces to the standard output.

kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:

    exporters:
      logging:

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [logging]
EOF

Now we can port-forward the application HTTP port via kubectl port-forward deployment.apps/spring-petclinic 8080:8080 and explore the application in a web browser. The spans should be reported to the OpenTelemetry collector and can be viewed via kubectl logs deployment.apps/otel-collector.

Auto-instrumentation is a rather complicated piece of software, and its implementation depends on the language. In Java, the auto-instrumentation is called the Javaagent. It performs bytecode manipulation that injects instrumentation points into specific code paths, and these instrumentation points create telemetry data (e.g. traces) when the code is executed. Auto-instrumentation is always supported only for a defined set of frameworks or APIs; the list of supported frameworks and application servers for Java is linked in the references.

How is the injection logic implemented?

You might wonder how the injection of auto-instrumentation is implemented and especially how the operator can configure the application to use it. Let me explain it for Java; similar concepts are used for other runtimes as well.

The OpenTelemetry operator implements a mutating admission webhook that is invoked when a Pod object is created or updated. The webhook modifies the Pod object to inject the auto-instrumentation libraries into the application container, and it configures the OpenTelemetry SDK and the runtime, in this case the Java virtual machine (JVM), to use the auto-instrumentation.

The auto-instrumentation library (Javaagent) is injected into the application container via an init container that copies the Javaagent into a volume that is mounted to the application container. The SDK configuration is done by injecting environment variables into the application container.

The final step is to configure the JVM to use the Javaagent. This is done by setting the JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup, so that it contains a -javaagent option pointing at the copied Javaagent.
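Putting the three steps together (init container, shared volume, environment variables), the mutated Pod might look roughly like the fragment below. This is an illustrative sketch rather than the operator's exact output: the volume and init container names, the mount path /otel-auto-instrumentation, the cp command, and the OTEL_SERVICE_NAME value are assumptions and can differ between operator versions. The OTEL_* variables are the standard OpenTelemetry SDK environment variables, filled in here from the Instrumentation CR shown earlier.

```yaml
# Illustrative fragment of a Pod spec after mutation (names and paths are assumptions).
spec:
  initContainers:
  # Copies the Javaagent from the instrumentation image into the shared volume.
  - name: opentelemetry-auto-instrumentation
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
    command: ["cp", "/javaagent.jar", "/otel-auto-instrumentation/javaagent.jar"]
    volumeMounts:
    - name: opentelemetry-auto-instrumentation
      mountPath: /otel-auto-instrumentation
  containers:
  - name: app
    image: ghcr.io/pavolloffay/spring-petclinic:latest
    env:
    # Tells the JVM to load the Javaagent at startup.
    - name: JAVA_TOOL_OPTIONS
      value: "-javaagent:/otel-auto-instrumentation/javaagent.jar"
    # SDK configuration derived from the Instrumentation CR.
    - name: OTEL_SERVICE_NAME
      value: spring-petclinic
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: http://otel-collector:4317
    - name: OTEL_PROPAGATORS
      value: tracecontext,baggage,b3
    - name: OTEL_TRACES_SAMPLER
      value: parentbased_traceidratio
    - name: OTEL_TRACES_SAMPLER_ARG
      value: "0.25"
    volumeMounts:
    # The application container sees the Javaagent through the same volume.
    - name: opentelemetry-auto-instrumentation
      mountPath: /otel-auto-instrumentation
  volumes:
  - name: opentelemetry-auto-instrumentation
    emptyDir: {}
```

The actual result for a running pod can be inspected with kubectl get pod <pod-name> -o yaml, which is the reliable way to see what a given operator version injects.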

Conclusion

We have seen that using OpenTelemetry auto-instrumentation on Kubernetes workloads can be simplified with the operator pattern. No change to application code or container images is required. This approach brings huge value for quickly validating telemetry solutions. At the moment only the Java, Python, and NodeJS runtimes are supported; however, support for other languages is coming. Statically compiled languages like Golang or C++ can still use this feature to dynamically configure the SDK. This essentially means that the Instrumentation CR defines a control plane for the OpenTelemetry SDK and instrumentations.

References

OpenTelemetry Operator: https://github.com/open-telemetry/opentelemetry-operator#opentelemetry-auto-instrumentation-injection
Instrumentation CR docs: https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#instrumentation
Supported frameworks in Java auto-instrumentation: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md#supported-libraries-frameworks-application-servers-and-jvms
Spring Pet Clinic: https://github.com/spring-projects/spring-petclinic
Kubernetes admission webhook: https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/
Java tools options: https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/envvars002.html