Distributed Tracing with Jaeger and OpenTelemetry in a Microservices Architecture

Ebubekir Dinc
7 min read · Dec 24, 2023

This article is part of my Microservices and Cloud-Native Applications series. You can find the other parts of the series below.

  1. Saga Orchestration using MassTransit in .NET
  2. API Gateway with Ocelot
  3. Authorization and Authentications with IdentityServer
  4. Eventual Consistency with Integration Events using RabbitMq
  5. Distributed Logging with ElasticSearch, Kibana, and SeriLog
  6. Resiliency and Fault Tolerance with Polly
  7. Health Check with WatchDogs in a Microservices Architecture
  8. Distributed Tracing with Jaeger and OpenTelemetry in a Microservices Architecture
  9. Metrics to Monitor Microservices with OpenTelemetry and Prometheus

If you want to take a look at the GitHub code, you can access it here: https://github.com/ebubekirdinc/SuuCat

Distributed tracing is a technique for tracking and profiling application performance, particularly in microservices architectures. In such an architecture, an application is split into several small, autonomous services that communicate with one another to accomplish a broader business objective. Distributed tracing makes it easier to understand and analyze the flow of requests as they pass through these services.

Distributed tracing provides end-to-end visibility into the entire flow of a request as it moves through different microservices. This visibility is essential for understanding the performance of the entire system and identifying bottlenecks or issues.

OpenTelemetry is an open-source observability project that offers a collection of APIs, SDKs, libraries, agents, and instrumentation. It helps developers integrate traces, metrics, and logs into their applications, which makes it simpler to monitor and understand the performance of distributed systems.

In our project, SuuCat, distributed tracing is implemented using OpenTelemetry together with Jaeger.

Jaeger is a distributed tracing platform that maps the flow of requests and data as they traverse a distributed system. These requests may call multiple services, each of which may introduce its own delays or errors. Jaeger connects the dots between these disparate components, helping to identify performance bottlenecks, troubleshoot errors, and improve overall application reliability. Jaeger is open source, a graduated CNCF project, and designed to scale horizontally.

Jaeger can be installed using the following Docker Compose files. More information about the installation is available here: https://github.com/ebubekirdinc/SuuCat/wiki/GettingStarted

docker-compose.yml

https://github.com/ebubekirdinc/SuuCat/blob/master/docker-compose.yml

docker-compose.override.yml

https://github.com/ebubekirdinc/SuuCat/blob/master/docker-compose.override.yml

We have a shared project called Tracing, which the other microservices reference. The NuGet packages for this project are as follows.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/BuildingBlocks/Tracing/Tracing.csproj

In the Tracing project, AddOpenTelemetry is a static extension method on the IServiceCollection interface. It configures OpenTelemetry and adds it to a microservice; each microservice calls it in its Startup.cs file during initialization. It registers the tracing sources and configures the resource with the service name and version.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/BuildingBlocks/Tracing/OpenTelemetryExtensions.cs
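For readers who do not want to open the repository, here is a minimal sketch of what such an extension method can look like. It is not the actual SuuCat code: the method name AddTracingSketch and its two string parameters are hypothetical, and it assumes the OpenTelemetry.Extensions.Hosting package is referenced.

```csharp
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public static class TracingSketchExtensions
{
    // Hypothetical name and parameters; the real method in the Tracing project may differ.
    public static IServiceCollection AddTracingSketch(
        this IServiceCollection services,
        string serviceName,
        string serviceVersion)
    {
        services.AddOpenTelemetry()
            .WithTracing(tracing => tracing
                // Listen to activities created by an ActivitySource with this name.
                .AddSource(serviceName)
                // Describe the service that emits the spans (name and version).
                .SetResourceBuilder(ResourceBuilder.CreateDefault()
                    .AddService(serviceName: serviceName, serviceVersion: serviceVersion)));

        return services;
    }
}
```

A microservice would then call something like services.AddTracingSketch("Identity", "1.0.0") during startup; the service name and version shown here are placeholders.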

Every microservice needs to be instrumented in order to enable distributed tracing. This involves adding code to capture timing and contextual information about requests. This information is then used to create a trace that shows the path of a request as it moves through the system.

The AddAspNetCoreInstrumentation() method adds ASP.NET Core instrumentation to the OpenTelemetry pipeline. This instrumentation collects telemetry data about the incoming HTTP requests that an ASP.NET Core application handles and the responses it returns.

o.Filter: This is a predicate that decides, based on the HTTP context, whether a given request should be traced. Here, it is set to trace only requests whose path contains “Api”.

o.EnrichWithHttpRequest: This enriches the activity with additional information from the HTTP request. Here, it’s adding a tag with the request protocol.

o.EnrichWithHttpResponse: This enriches the activity with additional information from the HTTP response. Here, it’s adding a tag with the response length.

o.RecordException: This is a setting that determines whether unhandled exceptions should be automatically recorded.

o.EnrichWithException: This enriches the activity with additional information from any unhandled exceptions. Here, it’s adding tags with the exception type and stack trace.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/BuildingBlocks/Tracing/OpenTelemetryExtensions.cs
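The options described above can be sketched roughly as follows. This is an illustration, not the code from the repository: the method name AddHttpTracingSketch, the tag names, and the case-insensitive path comparison are all assumptions, and the OpenTelemetry.Instrumentation.AspNetCore package is assumed to be referenced.

```csharp
using System;
using Microsoft.AspNetCore.Http;
using OpenTelemetry.Trace;

public static class AspNetCoreTracingSketch
{
    // Hypothetical helper; the tag names below are illustrative.
    public static TracerProviderBuilder AddHttpTracingSketch(this TracerProviderBuilder builder)
    {
        return builder.AddAspNetCoreInstrumentation(o =>
        {
            // Trace only requests whose path contains "Api"
            // (case-insensitive here, which is an assumption).
            o.Filter = httpContext =>
                httpContext.Request.Path.Value?
                    .Contains("Api", StringComparison.OrdinalIgnoreCase) ?? false;

            // Add the request protocol (for example "HTTP/1.1") as a tag.
            o.EnrichWithHttpRequest = (activity, request) =>
                activity.SetTag("http.request.protocol", request.Protocol);

            // Add the response length as a tag (null when the length is unknown).
            o.EnrichWithHttpResponse = (activity, response) =>
                activity.SetTag("http.response.length", response.ContentLength);

            // Record unhandled exceptions on the span automatically.
            o.RecordException = true;

            // Add details about the unhandled exception as tags.
            o.EnrichWithException = (activity, exception) =>
            {
                activity.SetTag("exception.type", exception.GetType().FullName);
                activity.SetTag("exception.stacktrace", exception.StackTrace);
            };
        });
    }
}
```

Keeping the filter narrow avoids creating spans for health checks, static files, and similar requests that would only add noise to Jaeger.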

The AddEntityFrameworkCoreInstrumentation() method is used to add Entity Framework Core instrumentation to the OpenTelemetry pipeline. This instrumentation collects telemetry data about database operations performed using Entity Framework Core in a .NET application.

AddConsoleExporter() writes tracing data to the console. AddOtlpExporter() exports tracing data over the OpenTelemetry Protocol (OTLP) so that it can be viewed in Jaeger.
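A minimal sketch of registering both exporters is shown below. The helper name AddExportersSketch and the endpoint parameter are hypothetical, and the OpenTelemetry.Exporter.Console and OpenTelemetry.Exporter.OpenTelemetryProtocol packages are assumed to be referenced.

```csharp
using System;
using OpenTelemetry.Trace;

public static class ExporterSketch
{
    // Hypothetical helper; in a real service the endpoint would usually come from configuration.
    public static TracerProviderBuilder AddExportersSketch(
        this TracerProviderBuilder builder,
        Uri otlpEndpoint)
    {
        return builder
            // Print finished spans to the console, which is handy during local development.
            .AddConsoleExporter()
            // Send finished spans over OTLP to the given endpoint.
            .AddOtlpExporter(o => o.Endpoint = otlpEndpoint);
    }
}
```

With a local Jaeger instance that has its OTLP gRPC receiver enabled, the endpoint would typically be http://localhost:4317; the host name is an assumption and depends on your Docker network.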

Jaeger SQL statement

As the Jaeger screen above shows, the generated SQL statement can also be inspected.

If you want to add trace data manually, you can do so as shown below.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/Services/Identity/Controllers/AuthController.cs

And the result in Jaeger is like this.

Jaeger adding manual trace log
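Manual tracing of this kind can be sketched with the ActivitySource API from System.Diagnostics, which is what OpenTelemetry for .NET builds on. The class, source name, and tag names below are hypothetical and are not taken from AuthController; the source name would have to be registered with AddSource(...) for the spans to reach Jaeger.

```csharp
using System.Diagnostics;

public static class ManualTracingSketch
{
    // Hypothetical source name; it must match a name passed to AddSource(...).
    public static readonly ActivitySource Source = new("SuuCat.Identity.Sketch");

    public static void CreateUser(string userName)
    {
        // StartActivity returns null when nothing is listening to this source,
        // so every call on the activity uses the null-conditional operator.
        using Activity? activity = Source.StartActivity("CreateUser");

        // Tags are key/value pairs shown on the span in Jaeger.
        activity?.SetTag("user.name", userName);

        // Events are timestamped log entries attached to the span.
        activity?.AddEvent(new ActivityEvent("User validated"));

        // ... the actual work would happen here ...

        activity?.SetStatus(ActivityStatusCode.Ok);
    }
}
```

Because the activity is declared with using, it is stopped and exported when the method returns, even if an exception is thrown.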

So far, we have looked at tracing information within a single microservice. Now let’s look at a request that starts in one microservice and is processed asynchronously in another.

In our project, when a user record is created in the Identity microservice, the other microservices must be notified. For example, the Account microservice should store the new user's information in its own database. With distributed tracing, we can follow a request that starts in one microservice into the other microservices. From a single place, we can observe where time is spent and how long each step takes, which SQL statements are executed, and which exceptions occur. This is a simple example of eventual consistency. For more information, see my article here.

You can see an example of this below. Here we see the journey of a request between microservices.

Distributed tracing between microservices.

So, how is the communication between microservices managed?

OpenTelemetry defines a trace context, which includes a trace ID and span ID. These identifiers are propagated across different microservices and components as a request traverses through the system. This trace context is crucial for correlating spans and reconstructing the full trace.
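To make the propagation step concrete, here is a small sketch that uses only System.Diagnostics. It relies on the W3C Trace Context format, in which the trace ID and span ID travel in a traceparent value of the form 00-{trace-id}-{span-id}-{flags}. The class and method names are hypothetical; in practice the HTTP and messaging instrumentation usually does this for you.

```csharp
using System.Diagnostics;

public static class TraceContextSketch
{
    // Sender side: the W3C "traceparent" value of the activity,
    // which would be placed in an HTTP or message header.
    public static string? Inject(Activity activity) =>
        activity.IdFormat == ActivityIdFormat.W3C ? activity.Id : null;

    // Receiver side: start an activity that continues the sender's trace.
    // Falls back to a new root activity when the header cannot be parsed.
    public static Activity? StartFromParent(
        ActivitySource source, string name, string? traceparent)
    {
        return ActivityContext.TryParse(traceparent, null, out ActivityContext parent)
            ? source.StartActivity(name, ActivityKind.Consumer, parent)
            : source.StartActivity(name, ActivityKind.Consumer);
    }
}
```

The activity started on the receiver side shares the sender's trace ID and records the sender's span ID as its parent, which is what lets Jaeger stitch spans from different services into one trace.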

If you want to see the logs for the same traceId in ElasticSearch/Kibana, you can achieve this with a middleware. The middleware below retrieves the current TraceId from the Activity class, which represents a unit of work or operation. This TraceId can be used to correlate logs, traces, and other telemetry. The middleware then uses it to begin a logging scope with _logger.BeginScope. Logging scopes attach additional data to every log event created within the scope.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/BuildingBlocks/Tracing/LogContextMiddleware.cs
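A middleware of this kind can be sketched as follows. This is not the LogContextMiddleware from the repository: the class name and the traceId scope key are assumptions, and whether the scope value appears as a field in Kibana depends on how the logging provider is configured.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

// Hypothetical middleware name; the scope key "traceId" is also an assumption.
public class TraceIdLoggingMiddlewareSketch
{
    private readonly RequestDelegate _next;
    private readonly ILogger<TraceIdLoggingMiddlewareSketch> _logger;

    public TraceIdLoggingMiddlewareSketch(
        RequestDelegate next,
        ILogger<TraceIdLoggingMiddlewareSketch> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Activity.Current is the span of the request being processed, if any.
        string? traceId = Activity.Current?.TraceId.ToString();

        if (traceId is null)
        {
            // No active trace: continue without adding a scope.
            await _next(context);
            return;
        }

        // Every log event written further down the pipeline carries the traceId.
        using (_logger.BeginScope(new Dictionary<string, object> { ["traceId"] = traceId }))
        {
            await _next(context);
        }
    }
}
```

It would be registered with app.UseMiddleware<TraceIdLoggingMiddlewareSketch>() early in the pipeline, so that the scope covers the logs of the rest of the request.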

After adding this middleware, we can see the traceId field in Kibana as follows.

TraceId in ElasticSearch/Kibana

We can then copy this traceId and paste it into the search box in Jaeger to access the corresponding trace data. Being able to look up the same traceId across platforms makes our work much easier.

TraceId in Jaeger

Jaeger has different screens where you can monitor trace data. On the Trace Span Table page, we can see all spans in a trace.

Jaeger Trace Span Table

Here you can see the Trace Flamegraph.

Jaeger Trace Flamegraph

And Trace Statistics.

Jaeger Trace Statistics

In conclusion, OpenTelemetry offers a standardized collection of tools and libraries that make it easier to integrate distributed tracing into applications. Because of its adaptability, community support, and integration possibilities, it’s a well-liked option for developers who want to improve distributed system observability.

Several tools are available for distributed tracing, each with its own advantages and characteristics; some are paid and some are free. Widely used alternatives to Jaeger include Zipkin, Azure Application Insights, and New Relic.

More information can be found in the Jaeger docs and in the SuuCat GitHub repository.

References:

https://vgaltes.com/post/forwarding-correlation-ids-in-aspnetcore-version-2/
