Jaeger data analytics with Jupyter notebooks

Published in JaegerTracing · 4 min read · Mar 17, 2020

In the previous blog post, "Data analytics with Jaeger aka traces tell us more!", we introduced our data science initiative and platform. The ultimate goal is to develop new AI/ML-based functionality within the Jaeger project that provides new insights into our applications. This type of functionality is also referred to as AI operations (AIOps).

Jupyter notebooks

Jupyter notebooks provide a simple user interface for experimenting with data. There are two main use cases that we want to accomplish with the notebooks:

an interface for data scientists to experiment with tracing data
on-demand incident investigation

The first use case is self-explanatory. It lowers the bar for non-infrastructure people to connect to the Jaeger server and to consume and analyze the data. The second use case is trickier. Imagine an incident that requires us to analyze a trace, or a set of traces, with hundreds or thousands of spans. Such an analysis might not be feasible in the user interface. Instead, we could write code to verify our hypothesis. For this purpose we have developed the Trace DSL, based on the graph query language Gremlin, to simplify filtering and feature extraction of tracing data. A Jaeger user can therefore spin up a Jupyter notebook with the Trace DSL on demand and write a query and an analysis.

Jupyter notebook example with Jaeger

In this section we deploy a Jupyter notebook with the Jaeger Trace DSL and write a simple query against the Jaeger server.

Let’s deploy Jaeger, the HotROD example and a Jupyter notebook with the Trace DSL:

docker run --rm -it -p 16686:16686 --name=jaeger jaegertracing/all-in-one:1.17
docker run --rm -it -p 8080:8080 --link=jaeger -e JAEGER_ENDPOINT=http://jaeger:14268/api/traces jaegertracing/example-hotrod:1.17
docker run --rm -it -p 8888:8888 -p 4041:4040 -p 9001:9001 --link=jaeger -e JUPYTER_ENABLE_LAB=yes quay.io/jaegertracing/jaeger-analytics-java:latest

Add -v ${PWD}:/home/jovyan/work to the Jupyter notebook command if you want to open the notebooks from your current directory. The notebooks are hosted in the jaegertracing/jaeger-analytics-java repository.

Now open the Jaeger UI at http://localhost:16686, the HotROD example at http://localhost:8080 and the Jupyter notebook at http://localhost:8888/lab. The token for the Jupyter lab is written to the Jupyter console logs.

For the analysis we have to generate some data, so in the HotROD UI click on the blue boxes to order a car, which generates nice traces. To verify that a trace reached Jaeger, open the Jaeger UI and search for traces from the frontend service. The trace should look like this:

Trace from the HotROD example application.

Once we know that the data is stored in Jaeger, we can move to the Jupyter notebook and load the trace there. The Jaeger notebooks are stored in the jupyter directory. This directory can be opened either from the project root directory or, if the notebooks from the host filesystem are mounted into the Docker container, from the work directory.

Before running the analysis we have to load dependencies into the notebook’s classpath. Just click on the dependencies cell to make it active and then on the play icon in the top navigation menu.

Jupyter notebook for loading data from jaeger-query.

Before running the code we have to update the variable traceIdStr to point to one of the traces we generated earlier in the HotROD app.

The results are written below the code cell. In this case the trace has a height of 3 and there is one calculated network latency, between the frontend and server services, of 0.00102 ms. The latency is small because all services run in the same process and there is no real network overhead.
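To make the two metrics more concrete, here is a minimal sketch in plain Java. It is not the jaeger-analytics-java implementation: the Span class, its fields and both method names are hypothetical, the height is assumed to be the number of edges on the longest path from the root span to a leaf, and the network latency is assumed to be the start-time difference between a parent span and a child span that belong to different services. The timestamps are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the model and method names are assumptions. */
class TraceMetricsSketch {

    /** Hypothetical minimal span: service, operation, start time and children. */
    static final class Span {
        final String service;
        final String operation;
        final long startMicros;
        final List<Span> children = new ArrayList<>();

        Span(String service, String operation, long startMicros) {
            this.service = service;
            this.operation = operation;
            this.startMicros = startMicros;
        }

        /** Adds a child span and returns this span so calls can be chained. */
        Span child(Span c) {
            children.add(c);
            return this;
        }
    }

    /** Assumed definition: edges on the longest path from this span to a leaf. */
    static int height(Span span) {
        int tallestChild = -1; // a span without children ends up with height 0
        for (Span c : span.children) {
            tallestChild = Math.max(tallestChild, height(c));
        }
        return tallestChild + 1;
    }

    /** Assumed definition: child start minus parent start, across services only. */
    static List<Long> networkLatenciesMicros(Span root) {
        List<Long> latencies = new ArrayList<>();
        collect(root, latencies);
        return latencies;
    }

    private static void collect(Span parent, List<Long> out) {
        for (Span child : parent.children) {
            if (!parent.service.equals(child.service)) {
                out.add(child.startMicros - parent.startMicros);
            }
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        // Made-up trace: frontend -> customer -> customer -> mysql
        Span query = new Span("mysql", "query", 1075L);
        Span load = new Span("customer", "load", 1050L).child(query);
        Span handle = new Span("customer", "handle", 1040L).child(load);
        Span root = new Span("frontend", "request", 1000L).child(handle);

        System.out.println("height = " + height(root));                            // height = 3
        System.out.println("latencies (us) = " + networkLatenciesMicros(root));    // [40, 25]
    }
}
```

Only the parent and child spans that cross a service boundary contribute a latency value, which is why the two spans inside the customer service produce no entry.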

The last code cell shows the direct use of the Trace DSL with Apache Gremlin. Gremlin is a graph traversal language, and the TraceTraversalSource class extends it with methods for trace filtering and feature extraction. For instance, it adds a method hasName(String name) to filter spans by operation name. In our example the query verifies whether two spans with the given operation names are directly or indirectly connected, in other words whether one is a descendant of the other.
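The descendant check can be illustrated without Gremlin. The following sketch is plain Java, not the Trace DSL or the TinkerPop API: the Span class and the isDescendant method are hypothetical, and the operation names are placeholders. It only shows the semantics of the query, namely that a span with the second name is reachable by following child edges from a span with the first name.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only; this is not the Trace DSL. */
class DescendantCheckSketch {

    /** Hypothetical minimal span: an operation name plus child spans. */
    static final class Span {
        final String operation;
        final List<Span> children;

        Span(String operation, Span... children) {
            this.operation = operation;
            this.children = Arrays.asList(children);
        }
    }

    /**
     * Returns true if some span named descendantName lies directly or
     * indirectly below some span named ancestorName in the given trace.
     */
    static boolean isDescendant(Span root, String ancestorName, String descendantName) {
        Deque<Span> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Span current = stack.pop();
            if (current.operation.equals(ancestorName) && hasBelow(current, descendantName)) {
                return true;
            }
            for (Span c : current.children) {
                stack.push(c);
            }
        }
        return false;
    }

    /** Searches the subtree strictly below the given span for the name. */
    private static boolean hasBelow(Span span, String name) {
        for (Span c : span.children) {
            if (c.operation.equals(name) || hasBelow(c, name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Placeholder operation names; a real trace would use its own.
        Span trace = new Span("op-a",
                new Span("op-b", new Span("op-c")),
                new Span("op-d"));

        System.out.println(isDescendant(trace, "op-a", "op-c")); // true (indirect)
        System.out.println(isDescendant(trace, "op-d", "op-c")); // false (siblings' subtrees)
    }
}
```

In a graph traversal language the same question is typically expressed as a repeated walk along child edges starting from the spans that match the first name, which is what makes a DSL convenient for this kind of query.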

Conclusion

We have seen how easy it is to deploy a Jupyter notebook and write a simple query against the Jaeger server using gRPC-generated stubs. This feature provides a powerful interface for writing custom queries and testing hypotheses against traces retrieved from the Jaeger server. The example can easily be extended to collect a stream of traces from Kafka and experiment on the live data.

Any feedback is welcome! Get in touch with us on our Jaeger Gitter channel or simply open an issue and share your feedback or ideas.

References

Jaeger Java analytics: https://github.com/jaegertracing/jaeger-analytics-java
Data analytics with Jaeger blog post: https://medium.com/jaegertracing/data-analytics-with-jaeger-aka-traces-tell-us-more-973669e6f848
Apache Gremlin documentation: http://tinkerpop.apache.org/docs/current/reference/
Jaeger HotROD example application demo: https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941


Written by Pavol Loffay