Spring Cloud Microservices, Part 4: Distributed Tracing with Sleuth and Zipkin

Okan Ardıç
5 min read · Oct 12, 2022


Photo by Marten Bjork on Unsplash

Introduction

For the complete list of tutorials in the Spring Cloud Microservices series, you can check this link.

In this part of the Spring Cloud Microservices series, I will explain distributed tracing and how it can be managed in a microservices environment.

Let’s say we have a chain of requests (Service A calls Service B, Service B calls Service C and so on). It is quite useful to be able to answer questions such as who called whom, at what time, and what the result of the execution was. Zipkin and Sleuth answer these questions.

Zipkin and Sleuth are used to trace requests, especially in distributed applications. Sleuth prepares the traces and sends them from our services to Zipkin, and we can search and analyze the traces in the Zipkin UI.

Applications can send traces directly to Zipkin, but traces can be lost if the connection between the application and Zipkin breaks or the Zipkin server goes down. To guard against this, you can put a message broker such as Kafka or RabbitMQ in between as middleware: the broker stores the traces, and Zipkin retrieves them once it is available again.
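To make the buffering idea concrete, here is a minimal in-memory sketch. It is only a toy model: a BlockingQueue stands in for the RabbitMQ queue, plain strings stand in for encoded spans, and the class and method names (BufferedTraceDemo, publish, collectPending) are made up for illustration and do not come from Sleuth or Zipkin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model only: an in-memory queue stands in for RabbitMQ, and plain
// strings stand in for encoded spans. All names here are hypothetical.
class BufferedTraceDemo {

    // Stand-in for the broker queue that sits between the services and Zipkin.
    private final BlockingQueue<String> brokerQueue = new LinkedBlockingQueue<>();

    // Stand-in for the traces Zipkin has already stored.
    private final List<String> collected = new ArrayList<>();

    // A service publishes to the queue; the collector does not need to be up.
    void publish(String span) {
        brokerQueue.offer(span);
    }

    // Once the collector is available (again), it drains whatever accumulated.
    int collectPending() {
        List<String> batch = new ArrayList<>();
        brokerQueue.drainTo(batch);
        collected.addAll(batch);
        return batch.size();
    }

    int pending() {
        return brokerQueue.size();
    }

    List<String> collected() {
        return collected;
    }

    public static void main(String[] args) {
        BufferedTraceDemo demo = new BufferedTraceDemo();
        // The collector is "down": spans pile up in the queue instead of being lost.
        demo.publish("span-1");
        demo.publish("span-2");
        System.out.println("pending while collector is down: " + demo.pending());
        // The collector comes back and picks up everything that was waiting.
        System.out.println("collected after recovery: " + demo.collectPending());
    }
}
```

The point of the sketch is that the publisher never talks to the collector directly, so a collector outage only delays delivery instead of dropping traces.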

In this project we will be integrating RabbitMQ with Sleuth and Zipkin.

1- Installing RabbitMQ

First things first, let’s start by installing RabbitMQ. You can refer to this link to download and install RabbitMQ. If you already have Docker installed, it’s easy to start a RabbitMQ instance inside a container with the following command:

docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.10-management

This command will download the Docker image for RabbitMQ (if it has not been downloaded yet) and run it. Port 5672 (the default) is used for messaging and port 15672 is used for the management UI, so we publish both ports to the host OS to be able to reach them from outside the container. The --rm argument removes the container when it stops, which is fine here since we are using this command only for test purposes.

After starting RabbitMQ, you can navigate to http://localhost:15672/ to see the RabbitMQ management page. You should see a page similar to the following:

RabbitMQ Management Page

2- Installing Zipkin

You can refer to this link for the installation details or continue reading below to start Zipkin with RabbitMQ integration quickly.

To run Zipkin via Docker on Windows or Mac, just run the following command:

docker run --rm -d -p 9411:9411 -e RABBIT_URI=amqp://guest:guest@host.docker.internal:5672 openzipkin/zipkin

On Linux, host.docker.internal is not defined by default, so the command needs an extra --add-host argument that maps it to the host gateway:

docker run --add-host=host.docker.internal:host-gateway --rm -d -p 9411:9411 -e RABBIT_URI=amqp://guest:guest@host.docker.internal:5672 openzipkin/zipkin
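Both commands hand the broker address to Zipkin through the RABBIT_URI variable. As a quick way to see what that URI encodes, the following sketch parses it with the standard java.net.URI class. The class name RabbitUriSketch and the helper userAndPassword are made up for this example, and this is not how Zipkin itself reads the value.

```java
import java.net.URI;

// Illustration only: splits the RABBIT_URI value from the commands above
// into its parts using the JDK's URI parser. Names are hypothetical.
class RabbitUriSketch {

    // Splits "user:password" into its two parts; empty array if no user info is present.
    static String[] userAndPassword(URI uri) {
        String userInfo = uri.getUserInfo();
        return userInfo == null ? new String[0] : userInfo.split(":", 2);
    }

    public static void main(String[] args) {
        URI uri = URI.create("amqp://guest:guest@host.docker.internal:5672");
        String[] credentials = userAndPassword(uri);

        System.out.println("scheme   = " + uri.getScheme());
        System.out.println("username = " + credentials[0]);
        System.out.println("password = " + credentials[1]);
        System.out.println("host     = " + uri.getHost());
        System.out.println("port     = " + uri.getPort());
    }
}
```

The host part is the name the Zipkin container uses to reach RabbitMQ on the host machine, and the port matches the messaging port published when starting RabbitMQ.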

After starting Zipkin, you can navigate to http://localhost:9411 to check that the Zipkin UI is displayed correctly.

If both Zipkin and RabbitMQ start correctly, you should see a queue named zipkin under the Queues tab of the RabbitMQ management page.

Queues tab displayed on RabbitMQ UI

3- Setting up the Project

1- Add the following dependencies to your service’s pom.xml:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.amqp</groupId>
    <artifactId>spring-rabbit</artifactId>
</dependency>

2- Configure application.yml as follows:

spring:
  application:
    name: <service-name>
  zipkin:
    baseUrl: http://localhost:9411
    sender:
      # web, rabbit, activemq or kafka
      type: rabbit
  sleuth:
    sampler:
      probability: 1.0
  rabbitmq:
    host: localhost
    port: 5672
    username: guest
    password: guest

spring.zipkin.baseUrl: Zipkin URL.

spring.zipkin.sender.type: Defines where to submit the traces (one of web, rabbit, activemq and kafka). When set to web, traces are sent directly to Zipkin, which can lead to lost traces if Zipkin is down or unreachable. So, it is a good practice to use a queuing mechanism such as Kafka, RabbitMQ or ActiveMQ.

spring.sleuth.sampler.probability: Takes a value between 0 and 1.0 and lets you decide what percentage of the requests should be sent to Zipkin. 1.0 means 100% of the requests, 0.1 means 10% of the requests, and so on. Choose the value with your traffic in mind: a high value can generate a lot of tracing traffic, while a value that is too low might not provide enough information.

spring.rabbitmq.*: Properties used to set up the RabbitMQ integration, such as host, port and credentials.
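To get a feel for what a sampling probability means in practice, here is a small self-contained sketch of probabilistic sampling. It is a hypothetical illustration of the general technique (draw a uniform random number and compare it to the configured probability), not Sleuth's actual sampler, and the names ProbabilitySamplerSketch, shouldSample and countSampled are invented for this example.

```java
import java.util.Random;

// Illustration only: a hypothetical sampler, not Sleuth's implementation.
class ProbabilitySamplerSketch {

    private final double probability;
    private final Random random;

    ProbabilitySamplerSketch(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    // nextDouble() is uniform in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
    boolean shouldSample() {
        return random.nextDouble() < probability;
    }

    // Simulates a number of requests and counts how many would be reported.
    static int countSampled(double probability, int requests, long seed) {
        ProbabilitySamplerSketch sampler =
                new ProbabilitySamplerSketch(probability, new Random(seed));
        int sampled = 0;
        for (int i = 0; i < requests; i++) {
            if (sampler.shouldSample()) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        System.out.println("p=1.0 -> " + countSampled(1.0, 10_000, 42L) + " of 10000 requests sampled");
        System.out.println("p=0.1 -> " + countSampled(0.1, 10_000, 42L) + " of 10000 requests sampled");
    }
}
```

With a probability of 0.1 the count lands near, but not exactly at, 10% of the requests, which is why a very low value can miss the rare requests you might want to inspect.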

4- Analyzing the Traces

Let’s look at an example screenshot from Zipkin to see how a trace is displayed:

A trace displayed on Zipkin

In the sample trace, there are 3 separate RPCs. The first is a GET request to the API-GATEWAY service (see Tags on the right) on the path /users/1. API-GATEWAY then calls USER-SERVICE, which in turn calls ORDER-SERVICE. Each of these calls is called a span. Each span has a unique ID and is attached to its parent span by a parent ID. All of these spans share the same trace ID.

In short, the whole request chain is called a trace and each event within the trace is called a span.
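The relationship between trace ID, span ID and parent ID can be sketched with a small data model. This is a simplified illustration: the class names, the ID values ("trace-1", "span-a" and so on) and the callPath helper are invented here, and real Zipkin spans carry more fields such as timestamps and tags.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model for illustration; IDs are made up and real spans have more fields.
class TraceModelSketch {

    static final class Span {
        final String traceId;
        final String spanId;
        final String parentId; // null for the root span
        final String serviceName;

        Span(String traceId, String spanId, String parentId, String serviceName) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
            this.serviceName = serviceName;
        }
    }

    // Mirrors the sample trace: api-gateway -> user-service -> order-service.
    static List<Span> sampleTrace() {
        String traceId = "trace-1";
        Span gateway = new Span(traceId, "span-a", null, "api-gateway");
        Span user = new Span(traceId, "span-b", gateway.spanId, "user-service");
        Span order = new Span(traceId, "span-c", user.spanId, "order-service");
        return Arrays.asList(gateway, user, order);
    }

    // Follows parent IDs from the given span up to the root and lists the services in call order.
    static String callPath(List<Span> trace, Span leaf) {
        Map<String, Span> byId = new HashMap<>();
        for (Span span : trace) {
            byId.put(span.spanId, span);
        }
        Deque<String> names = new ArrayDeque<>();
        Span current = leaf;
        while (current != null) {
            names.addFirst(current.serviceName);
            current = current.parentId == null ? null : byId.get(current.parentId);
        }
        return String.join(" -> ", names);
    }

    public static void main(String[] args) {
        List<Span> trace = sampleTrace();
        Span orderSpan = trace.get(2);
        System.out.println(callPath(trace, orderSpan));
    }
}
```

Because every span stores only its own ID and its parent's ID, the full call chain can be rebuilt from any span as long as all spans of the trace share the same trace ID.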

When you click on each line, you can even see the Controller class that was called on the target service. For example, when you click on the line with ORDER-SERVICE, the output will be like below:

Zipkin Tags for Order Service

And when you click on the Show All Annotations button, you will see the request/response times of the call relative to the beginning of the incoming request:

Zipkin Annotations for Order Service

In the above image, user-service, as a client, makes a request to order-service (Client Start), order-service receives the request and starts processing (Server Start), order-service completes the execution and returns the response (Server Finish), and finally user-service receives the response (Client Finish).
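These four annotations are enough to work out where the time of a call went. The sketch below uses made-up offsets in microseconds (they are not taken from the screenshot), and the class and method names are hypothetical.

```java
// Illustration only: the timestamps used in main are invented example values.
class SpanTimingSketch {

    // Time order-service spent processing: Server Finish minus Server Start.
    static long serverDuration(long serverStart, long serverFinish) {
        return serverFinish - serverStart;
    }

    // Total time as seen by user-service: Client Finish minus Client Start.
    static long clientDuration(long clientStart, long clientFinish) {
        return clientFinish - clientStart;
    }

    // Time spent outside the server (network transit, queuing, serialization and so on).
    static long overhead(long clientStart, long serverStart, long serverFinish, long clientFinish) {
        return clientDuration(clientStart, clientFinish) - serverDuration(serverStart, serverFinish);
    }

    public static void main(String[] args) {
        // Hypothetical offsets relative to the start of the incoming request, in microseconds.
        long clientStart = 0;
        long serverStart = 1_200;
        long serverFinish = 5_200;
        long clientFinish = 6_500;

        System.out.println("server processing: " + serverDuration(serverStart, serverFinish) + " us");
        System.out.println("client total:      " + clientDuration(clientStart, clientFinish) + " us");
        System.out.println("overhead:          "
                + overhead(clientStart, serverStart, serverFinish, clientFinish) + " us");
    }
}
```

A large gap between the client total and the server processing time points at the network or the client side rather than at the called service.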

Conclusion

In this part of the series, I explained how distributed tracing can be managed using Sleuth and Zipkin. We have also seen how RabbitMQ can be used as middleware to store the traces when they cannot be delivered to Zipkin due to application or connectivity issues.

In the next tutorial, we will discuss how to implement a Circuit Breaker, which is a common pattern to short-circuit repeatedly failing requests.

Next Tutorial: Implementing Circuit Breaker with Resilience4j

Source Code

You can download the complete source code of this tutorial series from this link.

References

https://cloud.spring.io/spring-cloud-sleuth/reference/html/appendix.html

https://www.rabbitmq.com/download.html

https://zipkin.io/pages/quickstart.html

https://github.com/openzipkin/zipkin/blob/master/zipkin-collector/rabbitmq/README.md#configuration
