Kizceral: A Dynamic Realtime Architecture Diagram

Giannis Neokleous
Jan 9, 2018 · 4 min read

Giannis Neokleous, Paul Sastrasinh

Maintaining a large distributed service-oriented platform is hard. As both the organization and the architecture grow and more features are added, it becomes increasingly hard to keep a mental picture of all the services and the pockets of knowledge that form across the organization. Existing, manually drawn architecture diagrams become stale quickly, which makes it difficult for new employees to get up to speed on all the services. It also makes auditing service endpoints for security purposes impossible, since our security engineers are unfamiliar with the codebases. To address these issues, we at Knewton have been building Kizceral, a project that came out of a hack day. Kizceral is a dynamic realtime service dependency visualizer based on Vizceral that leverages tracing data generated by our in-house distributed tracing library, TDist.

How it Works

Kizceral consists of two main parts. The first is a Java backend responsible for gathering tracing data and building a dependency graph. The second part is a React frontend used to visualize the graph.

When a user loads Kizceral, a backend request is made to build the service dependency graph. The graph is built by querying Zipkin for the list of service names and then for a list of spans for each of those service names. The Kizceral backend then looks at the spans (see the Dapper data model) and builds links by connecting each target service with its direct dependencies. The graph inference makes a best effort in cases where only partial tracing data is available (e.g., missing annotations or Kafka topics). A graph representation is then sent back to the UI along with all the RPC methods that each service exposes. Since this graph is built in realtime and includes current traffic information, it is different from what Zipkin can generate using the bundled offline aggregate job.
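The link-building step can be illustrated with a minimal sketch. It is written in Python for brevity (the actual Kizceral backend is Java) and is not Kizceral's implementation. It assumes a simplified span shape with `id`, `parent_id`, `service`, and `name` keys, which is a flattened stand-in for the Dapper-style span model rather than the real Zipkin schema. The function name `build_dependency_graph` and the service names in the usage example are hypothetical.

```python
from collections import defaultdict


def build_dependency_graph(spans):
    """Infer caller -> callee links and per-service RPC methods from spans.

    Each span is a dict with the assumed keys "id", "parent_id", "service"
    and "name". Spans with missing data are skipped rather than rejected,
    mirroring the best-effort inference described in the text.
    """
    by_id = {span["id"]: span for span in spans}
    links = defaultdict(int)    # (caller, callee) -> number of calls observed
    methods = defaultdict(set)  # service -> names of spans it handled

    for span in spans:
        service = span.get("service")
        if service is None:
            continue  # partial data: no service name recorded on this span
        if span.get("name"):
            methods[service].add(span["name"])

        parent = by_id.get(span.get("parent_id"))
        if parent is None or parent.get("service") is None:
            continue  # root span, or the parent span was never collected
        if parent["service"] != service:
            # A child span handled by a different service is a direct dependency.
            links[(parent["service"], service)] += 1

    return dict(links), {svc: sorted(names) for svc, names in methods.items()}


# Usage with hypothetical services; span "d" points at a parent that is missing.
spans = [
    {"id": "a", "parent_id": None, "service": "gateway", "name": "GET /course"},
    {"id": "b", "parent_id": "a", "service": "course-service", "name": "getCourse"},
    {"id": "c", "parent_id": "b", "service": "user-service", "name": "getUser"},
    {"id": "d", "parent_id": "zz", "service": "orphan-service", "name": "ping"},
]
links, methods = build_dependency_graph(spans)
```

In this sketch, a span whose parent cannot be found still contributes its service and RPC method to the output, so the service appears as a node even when no link to it can be inferred.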

Here is a simplified view of all the dependencies:

The frontend is written in React and combines information from multiple sources in our infrastructure, which allows users to view fine-grained details about each service as they explore our architecture. For example, from a service node in the UI, a user is able to view:

Here is an example architecture diagram:

A user can select a service node and get a simplified view with all the upstream and downstream dependencies.

Challenges

Kizceral is a monitoring app, so it is designed to have as little impact on other running services as possible. Since thousands of traces are generated every second during peak traffic, we added a caching layer to cache the graph, the list of services and their details, and the service RPC methods. This was especially important at Knewton because the database cluster that stores tracing data is multitenant. Building an accurate dependency graph requires large amounts of data to be queried in real time, which can stress the database. For example, any service with long-running requests, each made up of several thousand sub-requests, could cause severe latencies in the database when its list of traces is fetched.

Our solution was to design a cache that dynamically adjusts the refresh rate and the number of traces fetched when querying service RPC data. Services that publish longer or more complicated spans are polled less often, since querying these services adds more load on our databases.
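One way such a cache could work is sketched below. This is an illustration under stated assumptions, not Kizceral's actual policy: the class name `AdaptiveServiceCache`, the `fetch(service, limit)` callback, and every default (`base_ttl`, `max_ttl`, `base_limit`, `min_limit`, `span_budget`) are hypothetical. It assumes that the average number of spans per trace is a reasonable proxy for how expensive a service is to query.

```python
import time


class AdaptiveServiceCache:
    """Caches traces per service; expensive services refresh less often.

    fetch(service, limit) is an assumed callback returning a list of traces,
    where each trace is a list of spans. All defaults are illustrative.
    """

    def __init__(self, fetch, base_ttl=30.0, max_ttl=600.0, base_limit=100,
                 min_limit=10, span_budget=1000, clock=time.monotonic):
        self._fetch = fetch
        self._clock = clock
        self.base_ttl = base_ttl        # seconds between refreshes for cheap services
        self.max_ttl = max_ttl          # upper bound on the refresh interval
        self.base_limit = base_limit    # traces requested for cheap services
        self.min_limit = min_limit      # lower bound on traces requested
        self.span_budget = span_budget  # spans per trace considered "cheap"
        self._entries = {}

    def get(self, service):
        now = self._clock()
        entry = self._entries.get(service)
        if entry is not None and now < entry["expires"]:
            return entry["traces"]  # still fresh: no database query

        limit = entry["next_limit"] if entry is not None else self.base_limit
        traces = self._fetch(service, limit)

        # Cost factor: how far the average trace exceeds the span budget.
        avg_spans = sum(len(t) for t in traces) / len(traces) if traces else 0
        cost = max(1.0, avg_spans / self.span_budget)

        self._entries[service] = {
            "traces": traces,
            # Costlier services wait longer before the next refresh...
            "expires": now + min(self.max_ttl, self.base_ttl * cost),
            # ...and request fewer traces when they are refreshed.
            "next_limit": max(self.min_limit, int(self.base_limit / cost)),
        }
        return traces
```

With these defaults, a service whose traces average 5,000 spans would be refreshed every 150 seconds with a limit of 20 traces, while a service with small traces keeps the 30-second interval and the full limit of 100.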

During development of Kizceral’s UI, we also noticed a small bug in Vizceral’s rendering and pushed a fix back upstream. Specifically, rendering on high-DPI displays was blurry. Since Vizceral uses Three.js and WebGL under the hood, we tracked this down to the device pixel ratio not being set properly. The end result: crystal clear architecture diagrams!

Future Work

We would like to see Kizceral become a valuable one-stop bird’s-eye view of our entire platform. To achieve that, we would like to add some additional features:

Knerd

The Knewton Blog - Stories about technology, product and…
