Navigating Graph Centrality
Introduction
Graph visualizations are frustratingly democratic, and should be more hierarchical. When navigating graphs, we are often tempted to view an entire forest when we intend to study one tree (and how it’s related to its neighbors).
This article builds on explorations from a previous post, and focuses on single entities within graph data structures. The thesis: in order to explore an entity within a graph, a user should be able to toggle between centralized and decentralized views.
Background: Enigma Verify
At Enigma, our Verify product is a tool that determines whether a queried small or medium-sized business (SMB) is legitimate. Under the hood, the API has a complex data infrastructure that links 72 datasets across 30 million companies. Naturally, we structure this linking system with a graph design, spanning entities, relationships, and the attributes contained within.
Our product’s API endpoint is simple, while the data science behind it is complex. We got a kick out of this contrast and spent some time creating a demo called Company Graph. This graph represents publicly available data for companies, locations, people, registrations, loans, inspections, and so on, all connected in a tangled web. The graph is hosted in an AWS Neptune cluster and queried with custom Gremlin endpoints. The front-end is built with a hybrid of React and D3.
Decentralized and Centralized Views
Below are two visualizations that represent the same graph data. Both have their benefits, and an important takeaway for us was that neither visualization is a suitable interface by itself. Coupled, the two complement each other and create a clearer picture.
Decentralized View
The image here is a familiar graph visualization. In this view, we’re focusing on a particular company called Econo Market, which is subtly highlighted as a larger green circle.
We’ve opened a few additional entities to increase graph depth, and can start to see the network related to the company of reference. The user can select additional nodes to add to the larger network, and each node is categorically colored by entity type.
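To make the expand-and-color interaction concrete, here is a minimal Python sketch. It is not the Company Graph implementation (which is built with React and D3); the node IDs, sample entities other than Econo Market, the color values, and the function names are all hypothetical and chosen for illustration only.

```python
# Hypothetical sample data: only "Econo Market" comes from the article.
NODES = {
    "econo-market": {"label": "Econo Market", "type": "company"},
    "loc-1": {"label": "Street address", "type": "location"},
    "person-1": {"label": "Registered agent", "type": "person"},
    "reg-1": {"label": "State registration", "type": "registration"},
    "company-2": {"label": "Neighboring company", "type": "company"},
}
EDGES = [
    ("econo-market", "loc-1"),
    ("econo-market", "person-1"),
    ("econo-market", "reg-1"),
    ("loc-1", "company-2"),
]
# Assumed palette: one color per entity type (categorical coloring).
TYPE_COLORS = {
    "company": "green",
    "location": "blue",
    "person": "orange",
    "registration": "purple",
}


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def expand(visible, adjacency, node_id):
    """Return a new visible set with node_id and its direct neighbors added."""
    return visible | {node_id} | adjacency.get(node_id, set())


def color_for(node_id):
    """Look up the categorical color for a node's entity type."""
    return TYPE_COLORS[NODES[node_id]["type"]]


adjacency = build_adjacency(EDGES)
visible = expand(set(), adjacency, "econo-market")  # open the focus company
visible = expand(visible, adjacency, "loc-1")       # open one more entity
```

Each call to `expand` adds one hop of depth around the selected node, which mirrors how the visible network grows as the user opens entities.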
In the decentralized view above, we have many nodes visible on the screen, and can start to make sense of the network topology. However, given its non-linear landscape, it’s difficult to render the actual information for each entity. We can hover to see data for a particular node, but we can’t render all of the information in one view while maintaining legibility. In addition, Econo Market has no geometric priority on the screen, and could easily get lost in the weeds when the network increases in complexity. Let’s take a look at a contrasting view below.
Centralized View
The image here is a centralized view (tree view is a simpler name). Econo Market is given centrality and is promoted to the top of the screen, and all related nodes are structured based on their relationship to this central company.
Think about this as navigating a directory structure on your computer, where you can easily reference the depths of parent/child relationships while preserving interactivity.
In this case, the nodes tidy up into a tree view. Duplicated nodes are avoided by diagonally linked relationships, and sorting is based on the weight of each edge. In simpler terms, this interface functions much like a JSON viewer, with expanding and collapsing entities and their attributes. We see more information in this view, coupled with what we’ve gleaned from the topology in the decentralized view.
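One way to derive such a tree from graph data is a breadth-first traversal from the central entity. The Python sketch below is an illustration under stated assumptions, not the method used in Company Graph: it assumes heavier edges sort first, places each node exactly once, and records any edge to an already-placed node as a cross link (standing in for the diagonal links). The sample edges and weights are hypothetical.

```python
from collections import deque


def build_tree(root, weighted_edges):
    """Breadth-first tree from root; returns (children, cross_links).

    children maps each node to its ordered list of child nodes.
    cross_links lists non-tree edges, so no node is ever duplicated.
    """
    adjacency = {}
    for a, b, weight in weighted_edges:
        adjacency.setdefault(a, []).append((b, weight))
        adjacency.setdefault(b, []).append((a, weight))

    children = {root: []}
    parent = {root: None}
    cross_links = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Assumption: heavier edges first, ties broken by node id.
        ordered = sorted(adjacency.get(node, []), key=lambda nw: (-nw[1], nw[0]))
        for neighbor, _weight in ordered:
            if neighbor == parent[node]:
                continue  # the edge back to the parent is already in the tree
            if neighbor in children:
                link = tuple(sorted((node, neighbor)))
                if link not in cross_links:
                    cross_links.append(link)  # draw as a link, not a duplicate
                continue
            parent[neighbor] = node
            children[neighbor] = []
            children[node].append(neighbor)
            queue.append(neighbor)
    return children, cross_links


def render(children, node, depth=0):
    """Return the tree as indented lines, like a directory listing."""
    lines = ["  " * depth + node]
    for child in children[node]:
        lines.extend(render(children, child, depth + 1))
    return lines


# Hypothetical weighted edges around the article's example company.
SAMPLE_EDGES = [
    ("econo-market", "loc-1", 3.0),
    ("econo-market", "person-1", 2.0),
    ("econo-market", "reg-1", 1.0),
    ("person-1", "loc-1", 1.0),
    ("loc-1", "company-2", 1.0),
]
tree, links = build_tree("econo-market", SAMPLE_EDGES)
outline = render(tree, "econo-market")
```

Here the edge between `person-1` and `loc-1` ends up in `links` rather than in the tree, so both nodes appear once under the root and the extra relationship can be drawn as a connector between them.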
Centralized and Decentralized Transitions
Rather than seeing these views as separate interfaces, we elected to merge them into a single workflow. The centralized and decentralized views should complement rather than compete.
In order to maintain a user’s intuitive read on the different landscapes, object constancy comes to the fore as an important data visualization principle. The user needs to be convinced that they are indeed dealing with the same graph structure when transitioning between views. That, or we just wanted to geek out on the front end and watch these nodes animate.
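Object constancy comes down to matching nodes by a stable identifier across both layouts, then moving each one along a path between its two positions. In D3, a key function in the data join typically plays this matching role. The Python sketch below shows the idea only; the layouts, coordinates, and the `interpolate_layouts` name are hypothetical, and linear interpolation is an assumed easing choice.

```python
def interpolate_layouts(start, end, t):
    """Linearly interpolate positions for nodes present in both layouts.

    start and end map node id -> (x, y); t runs from 0.0 to 1.0.
    """
    t = max(0.0, min(1.0, t))  # clamp so positions never overshoot
    frame = {}
    for node_id in start.keys() & end.keys():
        (x0, y0), (x1, y1) = start[node_id], end[node_id]
        frame[node_id] = (x0 + (x1 - x0) * t, y0 + (y1 - y0) * t)
    return frame


# Hypothetical positions for the same two nodes in each view.
decentralized = {"econo-market": (400.0, 300.0), "loc-1": (520.0, 260.0)}
centralized = {"econo-market": (40.0, 20.0), "loc-1": (80.0, 60.0)}

halfway = interpolate_layouts(decentralized, centralized, 0.5)
```

Because the same IDs key both layouts, each circle on screen stays the same entity throughout the animation, which is what lets the user trust that the graph itself has not changed.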
Explorations continue in search of balancing legibility with complexity.