Automatic Instrumentation of a Python Flask application using OpenTelemetry with Jaeger

soner durum
Insider Engineering
Dec 18, 2023


Let's start with what OpenTelemetry is.

OpenTelemetry is an open-source project that provides a set of APIs, libraries, agents, and instrumentation standards for observability in software systems. Its goal is to make it easier for developers to instrument, observe, and analyze the behavior of their applications in a standardized and vendor-agnostic way.

Key components of OpenTelemetry include:

  1. Tracing: OpenTelemetry allows you to trace the flow of requests across various services and components of a distributed system. Tracing helps identify performance bottlenecks and understand the interactions between different parts of an application.
  2. Metrics: OpenTelemetry provides a framework for collecting and exposing metrics from applications. This includes standard metrics like CPU usage, memory consumption, and custom application-specific metrics.
  3. Context Propagation: OpenTelemetry enables the propagation of context information (such as trace and span identifiers) across different services, allowing for correlated traces and unified monitoring of distributed systems.
  4. Instrumentation Libraries: OpenTelemetry offers instrumentation libraries for various programming languages, making it easier for developers to instrument their code without building custom solutions for each application.
  5. Exporters: OpenTelemetry supports various exporters to send collected telemetry data to observability backends. Common exporters include Jaeger, Zipkin, Prometheus, and more.
  6. Open Standards: OpenTelemetry is designed to be vendor-agnostic and follows open standards, allowing for interoperability between different observability tools and platforms.
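To make the context-propagation item above more concrete: OpenTelemetry's default propagator uses the W3C Trace Context `traceparent` HTTP header, which packs a version, a 32-hex-character trace ID, a 16-hex-character parent span ID, and trace flags into a single string. The sketch below builds and parses such a header with plain Python. The function names are illustrative assumptions and are not part of the OpenTelemetry API.

```python
import re
import secrets
from typing import Optional

# W3C Trace Context layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_ids():
    """Return a random (trace_id, span_id) pair as hex strings."""
    return secrets.token_hex(16), secrets.token_hex(8)


def build_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> Optional[dict]:
    """Parse a traceparent value; return None if it is malformed or invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are defined as invalid by the specification.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

A service that receives this header reuses the trace ID and records the incoming span ID as the parent of its own span, which is what lets a backend stitch the spans from different services into one trace.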

The project was formed by merging two separate projects, OpenTracing and OpenCensus, to create a unified, community-driven standard for observability instrumentation. It is often used in microservices architectures and cloud-native applications to gain insights into the performance and behavior of complex, distributed systems.

What is Observability?

Observability is the ability to gain insights into the internal state of a system or application. This concept signifies the capability to understand and analyze events and conditions within a system. The primary goal of observability is to comprehend the complexity of a system, detect errors, and resolve performance issues.

Observability comprises three fundamental components:

  1. Logs: Logs encompass events, errors, and other crucial information in an application or system. Logs serve as a critical source for debugging and performance analysis.
  2. Metrics: Metrics measure the value of data generated (typically a performance indicator) by a system or application over a specific time frame. Metrics such as CPU usage, memory consumption, and processing times are vital for analyzing system behavior.
  3. Traces: Traces provide data that illustrates the step-by-step lifecycle of a process. They are essential for understanding why a process might be experiencing delays.
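As a rough illustration of how the three signals differ, the toy sketch below records all of them for a single operation using only the standard library. It is not how OpenTelemetry implements them, and the names (`handle_checkout`, `span`, `request_counter`) are hypothetical.

```python
import logging
import time
from collections import Counter
from contextlib import contextmanager

logger = logging.getLogger("checkout")  # logs: discrete events
request_counter = Counter()             # metrics: aggregated numbers over time
finished_spans = []                     # traces: timed steps of one request


@contextmanager
def span(name):
    """Time a block of code and record it when the block finishes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        finished_spans.append(
            {"name": name, "duration_s": time.perf_counter() - start}
        )


def handle_checkout(order_id):
    request_counter["checkout"] += 1
    with span("checkout"):
        with span("charge_card"):
            logger.info("charging card for order %s", order_id)
        with span("send_receipt"):
            logger.info("sending receipt for order %s", order_id)
```

After one call, the counter says how often the operation ran, the log lines say what happened, and the recorded spans say where the time went. Inner spans finish before the outer one, so they appear first in `finished_spans`.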

Incorporating observability practices into your development and operational workflows empowers you to proactively identify and address issues, ensuring the reliability and efficiency of your systems.

Instrumentation with OpenTelemetry

To make a system observable, it must be instrumented: that is, code from the system’s components must emit traces, metrics, and logs.

Automatic Instrumentation: OpenTelemetry simplifies the process of instrumenting applications by offering automatic instrumentation for various programming languages and frameworks. Through its automatic instrumentation capabilities, OpenTelemetry can automatically collect and propagate trace and metric data without requiring developers to add instrumentation code manually. This allows for a seamless integration of observability features into your applications.

Manual Instrumentation: For scenarios where more fine-grained control is needed or when dealing with custom frameworks, OpenTelemetry provides support for manual instrumentation. Developers can explicitly add instrumentation code to specific parts of their codebase, enabling the collection of custom traces and metrics. This manual approach offers flexibility and customization, allowing developers to focus on instrumenting the critical components of their applications.

In essence, OpenTelemetry accommodates both automatic and manual instrumentation, catering to diverse application architectures and developer preferences. This flexibility ensures that observability features can be easily integrated into applications, regardless of their complexity or specific requirements.

Automatic Instrumentation with Python

OpenTelemetry streamlines the process of instrumenting Python applications through automatic instrumentation. By leveraging OpenTelemetry’s Python instrumentation library, developers can effortlessly integrate tracing and metrics into their applications without the need for extensive manual intervention.

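To enable automatic instrumentation in Python, install the OpenTelemetry distro and the OTLP exporter, then run opentelemetry-bootstrap, which detects the packages installed in your environment and installs the matching instrumentation libraries: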

pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install

Assuming Flask is already installed in the environment, this installs the Flask instrumentation for our demo app:

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"


if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')

Now we can run our application with the opentelemetry-instrument command shown below:

export OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    --logs_exporter console \
    --service_name flask-sample-server \
    flask run -p 5000

Open http://localhost:5000 or send a curl request. You should see telemetry printed in the console, such as the following log record:

{
    "body": "127.0.0.1 - - [24/Nov/2023 17:57:50] \"GET / HTTP/1.1\" 200 -",
    "severity_number": "<SeverityNumber.INFO: 9>",
    "severity_text": "INFO",
    "attributes": {
        "otelSpanID": "0",
        "otelTraceID": "0",
        "otelTraceSampled": false,
        "otelServiceName": "flask-sample-server"
    },
    "dropped_attributes": 0,
    "timestamp": "2023-11-24T14:57:50.751923Z",
    "trace_id": "0x00000000000000000000000000000000",
    "span_id": "0x0000000000000000",
    "trace_flags": 0,
    "resource": "BoundedAttributes({'telemetry.sdk.language': 'python', 'telemetry.sdk.name': 'opentelemetry', 'telemetry.sdk.version': '1.21.0', 'service.name': 'flask-sample-server', 'telemetry.auto.version': '0.42b0'}, maxlen=None)"
}
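Note that the record above carries all-zero trace and span IDs. An all-zero ID is invalid, which means no span was active when this line was logged, presumably because the server's access log is written outside the request span. A small sketch for checking whether a console log record is correlated with a trace is shown below. The helper name `is_correlated` is an assumption for illustration, and the sample keeps only a few fields of the record.

```python
import json

# A trimmed copy of the console log record shown above.
record_json = """
{
    "severity_text": "INFO",
    "timestamp": "2023-11-24T14:57:50.751923Z",
    "trace_id": "0x00000000000000000000000000000000",
    "span_id": "0x0000000000000000",
    "trace_flags": 0
}
"""


def is_correlated(record):
    """Return True if the record carries a valid (non-zero) trace and span ID."""
    return int(record["trace_id"], 16) != 0 and int(record["span_id"], 16) != 0


record = json.loads(record_json)
print(is_correlated(record))
```

Log lines emitted inside a request handler, while the request span is active, should instead carry that span's non-zero IDs, which is what allows a backend to jump from a log line to its trace.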

OpenTelemetry Collector

OTEL Collector

In short, the OpenTelemetry Collector offers a vendor-agnostic implementation of how to receive, process, and export telemetry data. It removes the need to run, operate, and maintain multiple agents/collectors.

When to Use a Collector:

Usability and Quick Start

  • Sending data directly to a backend is suitable for initial exploration and rapid value realization.
  • In development or small-scale environments, direct data transmission can yield satisfactory results without a collector.

Recommendations for General Use:

  • Using a collector alongside your service is generally recommended.
  • A collector lets your service offload data quickly, while the collector itself handles additional tasks such as retries, batching, encryption, and sensitive data filtering.

Ease of Setup:

  • Setting up a collector is straightforward; default OTLP exporters assume a local collector endpoint.
  • Launching a collector allows for automatic telemetry reception, streamlining the setup process.
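The "local collector endpoint" default mentioned above can be sketched as follows. OTLP exporters fall back to http://localhost:4317 for gRPC and http://localhost:4318 for HTTP, and the standard environment variables `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` and `OTEL_EXPORTER_OTLP_ENDPOINT` override that default. The helper `resolve_otlp_endpoint` is a simplified illustration of this lookup order, not the exporter's actual code.

```python
import os

# Default OTLP endpoints, which assume a collector running on the same host.
DEFAULT_ENDPOINTS = {
    "grpc": "http://localhost:4317",
    "http/protobuf": "http://localhost:4318",
}


def resolve_otlp_endpoint(env=None, protocol="grpc"):
    """Return the configured trace endpoint, or the local default.

    Simplified: the signal-specific variable wins over the general one.
    For OTLP over HTTP, real exporters also append a per-signal path
    (such as /v1/traces) to the general endpoint; that step is omitted here.
    """
    env = os.environ if env is None else env
    return (
        env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
        or env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
        or DEFAULT_ENDPOINTS[protocol]
    )
```

For example, `resolve_otlp_endpoint({})` returns the local gRPC default, while passing `{"OTEL_EXPORTER_OTLP_ENDPOINT": "http://otel-collector:4317"}` (a hypothetical in-cluster service name) returns that address instead.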

Jaeger

Let’s talk a little about Jaeger before moving on to our example. Jaeger is a distributed tracing platform that you can use to:

  • Monitor and troubleshoot distributed workflows
  • Identify performance bottlenecks
  • Track down root causes
  • Analyze service dependencies

You can check here for more details. Technical Specs:

In our example, we proceed with the architecture below, but if you want to learn about other architectures, you can find them here.

Architecture

Demo Time

Please check the demo repo as a reference. I built this demo on AWS EKS, but you are free to use a local Kubernetes cluster or another cloud provider's Kubernetes service.

First, I created a new namespace called opentelemetry. After that, I applied all the YAML files under the yamls folder.

kubectl create ns opentelemetry
kubectl apply -f .

At this point, all the resources should be successfully created. If this is the case, you will see something like this:

Resource PODs in running state

If you want to deploy all these resources to another namespace, you need to change the namespace values, the OTEL collector’s endpoint in configmap.yaml, and the exporter endpoint in auto-instrumentation.yaml. Also, note that for automatic instrumentation, you need to add the inject annotation to the pod template in your application's deployment file.

annotations:
  instrumentation.opentelemetry.io/inject-python: "true"

After port-forwarding to our Flask application and the Jaeger query service (UI), we send a couple of requests to create our tracing spans:

kubectl port-forward service/flask-demo 5000:5000 -n opentelemetry
kubectl port-forward service/example-jaeger-query 16686:16686 -n opentelemetry
curl -v localhost:5000
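If you prefer to generate several requests at once instead of repeating the curl command, a small standard-library sketch such as the one below could be used. The helper name `send_requests` is hypothetical, and it assumes the port-forward to the Flask service above is still running.

```python
import urllib.request


def send_requests(url, count=5, timeout=5.0):
    """Send `count` GET requests to `url` and return the HTTP status codes."""
    statuses = []
    for _ in range(count):
        with urllib.request.urlopen(url, timeout=timeout) as response:
            statuses.append(response.status)
    return statuses
```

Calling `send_requests("http://localhost:5000/")` should return a list of 200 status codes, and each request should then show up as a trace for the service in the Jaeger UI on port 16686.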

Finally 🎉!

Thank you for joining us on this exploration of OpenTelemetry and Jaeger. I hope you enjoyed this article. If you have any questions, please feel free to contact me on LinkedIn or comment below.

To delve deeper into the world of engineering and technology, stay tuned for more insightful articles on our Insider Engineering Blog. Discover the latest trends, best practices, and innovative solutions that drive our commitment to excellence in software development.

Happy coding! 👾

References

https://opentelemetry.io/docs/

https://github.com/open-telemetry
