A brief history of our data visualization: From word documents to interactive graph databases

The Sentry
Published in The Sentry Sidebar · 6 min read · Jan 19, 2024

The Sentry’s hard-hitting investigations require immense attention to detail and deep analysis of complex relationships and storylines. The subjects of our investigations often attempt to obscure the truth, hide their involvement, and distance themselves from the allegations using a suite of tricks and tools. Our investigators face the challenge of exposing wrongdoing every day, and they need the right tools and data to make sense of the twists and turns as they dig into their stories. They also need resources to serendipitously stumble upon findings that expose a new layer, add a new character, or reveal a completely new storyline.

Arriving at the right resources to enable this kind of discovery has been an ongoing and iterative process. The Sentry’s current database — a neo4j graph database visualized using the platform Linkurious — contains over 150 million pieces of data and has proven to be an invaluable resource. But we didn’t start there, and the journey to our Linkurious database exemplifies both the importance of voluminous data collection and exploration in investigative reporting and the ways in which embracing new technologies and visualizations can accelerate and enhance analysis.

The dossier

The Sentry’s investigators evaluate scams and scandals involving dozens — sometimes hundreds — of companies and individuals. In the early days of the organization, to keep track of all the data and connections, The Sentry used dossiers — text documents that contained every last detail of each network. Meticulously created and updated, these dossiers spanned hundreds of pages and contained thousands of data points.

An example of a dossier. Photo: The Sentry.

A dossier contained entries for individuals, entities, and assets. Each entry listed its connections to other entries within the same document. For instance, if someone was a shareholder in a company, that company also had a profile in the dossier that listed that person as a shareholder. The connections were there, but it wasn’t always easy to wrap your head around them. It was even trickier to keep track of the less obvious connections, such as shared phone numbers or addresses, family or familial relationships, and contracts between businesses. These details filled the notes sections, but there was always the concern that we were missing key connections or that we had updated a data point in one entry and forgotten to update it in every other entry where it appeared.

The network visual

Dossiers gave us information, but they didn’t let us see it from another angle clearly. Given the level of complexity of The Sentry’s investigations, investigators needed some way of visualizing the networks they understood so deeply, especially if they wanted an audience to follow along with their stories. To help the reader understand how everyone was connected, The Sentry turned to network visuals produced using Omnigraffle — what we like to call “omnis.” While our reports dug into the evidence and laid out the details in depth, omnis quickly became a sharable hit. After all, seeing is believing.

For The Sentry, omnis became important complements to the dossiers. We added omnis to information and sanctions packages we submitted to our government partners, highlighting the critical connections that can sometimes get lost in the data and exposing the key enablers pinning networks together. We incorporated omnis into more and more of our reports: we featured 12 in “Making a Killing” in May 2020.

An omni graphic from The Sentry’s 2020 report, “Making a Killing.” Photo: The Sentry.

While omnis offered a visual representation critical to conveying our findings, our knowledge management capacities were still missing an upgrade. The kleptocracy networks at the core of The Sentry’s investigations often feature recurring characters, and sometimes these networks are connected to one another across borders. Dossiers with omnis stored information well for a single investigation, but collaboration was challenging. We needed a tool that would allow us to store our knowledge base, connect data points, and visualize networks in a more dynamic manner.

The graph database

Enter Linkurious, a visualization tool for graph databases. The software had been successfully used by the International Consortium of Investigative Journalists (ICIJ) for years, including as part of their work exploring the Panama Papers. Graph databases were born for this purpose. They could capture the depth of data we were storing in dossiers, while also visually representing relationships. The hundreds of pages of dossiers would turn into hundreds of nodes, connected with hundreds of edges, storing thousands of properties on both the nodes and the edges. Plus, with export capabilities from Linkurious to Excel, we always had the option to recreate dossiers when we needed them.

And so The Sentry’s neo4j database visualized within Linkurious was born. The early days were focused on creating a knowledge base for ongoing investigations. Our data team converted dossiers into spreadsheets based on the data schema and produced the corresponding nodes and edges within neo4j. Investigators then had access to the networks from their current work within Linkurious, and they could add nodes and edges manually as they continued investigating.
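To make the dossier-to-graph step concrete, here is a minimal sketch in Python of how relationship rows in a spreadsheet could be turned into nodes and edges. It is an illustration only, not The Sentry's actual schema or import pipeline: the column names, the relationship types, and every name in the sample data are invented.

```python
import csv
import io

# Hypothetical spreadsheet export of a dossier: one row per relationship.
# Column names and values are invented for illustration.
SHEET = """source_name,source_type,relationship,target_name,target_type,since
Jane Doe,Person,SHAREHOLDER_OF,Acme Mining Ltd,Company,2015
John Roe,Person,DIRECTOR_OF,Acme Mining Ltd,Company,2017
Jane Doe,Person,DIRECTOR_OF,Blue Nile Trading,Company,2018
"""

def rows_to_graph(csv_text):
    """Turn relationship rows into de-duplicated nodes and a list of edges."""
    nodes = {}   # (label, name) -> property dict
    edges = []   # dicts with source key, target key, type, and properties
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = (row["source_type"], row["source_name"])
        target = (row["target_type"], row["target_name"])
        # A person or company that appears in many rows becomes a single node.
        nodes.setdefault(source, {"name": row["source_name"]})
        nodes.setdefault(target, {"name": row["target_name"]})
        edges.append({
            "source": source,
            "target": target,
            "type": row["relationship"],
            "properties": {"since": int(row["since"])},
        })
    return nodes, edges

nodes, edges = rows_to_graph(SHEET)
print(len(nodes), "nodes,", len(edges), "edges")
```

The point of the sketch is the de-duplication: a person who appears in several dossier entries becomes one node with several edges, which is what removes the need to keep duplicate listings in sync by hand.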

Sample simple graph model. Photo: The Sentry.

We had made omnis come to life. It was possible to edit and revise them in real time and to capture even the tiniest details within the many available properties for each node and edge.

This was cool. And it was starting to really pay off. Ever-changing networks, such as those used by war profiteers and sanctions evaders, had been challenging to keep up to date because they shift to avoid detection. With Linkurious, it was easier to see the connective tissue (phone numbers, addresses, enablers) that exposed new companies and members of the network. We could submit valuable intelligence to governments on these new suspected front companies and accomplices.
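The idea of connective tissue can be sketched in a few lines of Python: index every entity by its phone number and address, then report the pairs that share a value. This is an illustration under assumptions, not a description of how Linkurious or our database does it, and all of the records below are invented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; names, phone numbers, and addresses are invented.
ENTITIES = [
    {"name": "Acme Mining Ltd", "phone": "+000 555 0101", "address": "12 River Road"},
    {"name": "Blue Nile Trading", "phone": "+000 555 0101", "address": "4 Market Street"},
    {"name": "Cedar Logistics", "phone": "+000 555 0199", "address": "12 River Road"},
    {"name": "Delta Holdings", "phone": "+000 555 0142", "address": "9 Hill Lane"},
]

def shared_attribute_links(entities, fields=("phone", "address")):
    """Return (name_a, name_b, field, value) for each pair sharing a value."""
    index = defaultdict(list)  # (field, value) -> names that carry it
    for entity in entities:
        for field in fields:
            value = entity.get(field)
            if value:
                index[(field, value)].append(entity["name"])
    links = []
    for (field, value), names in index.items():
        # Every pair of entities under the same value is a candidate link.
        for a, b in combinations(sorted(names), 2):
            links.append((a, b, field, value))
    return sorted(links)

links = shared_attribute_links(ENTITIES)
for link in links:
    print(link)
```

In a graph, the same effect comes from modeling the phone number or address as its own node, so that two companies pointing at it are visibly two hops apart.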

Example of a complex network graphic. Photo: The Sentry.

More data

With Linkurious, we were maximizing our ability to take advantage of the knowledge we already possessed and giving investigators new ways to analyze and make sense of their data. But the ultimate mission was still to reveal findings that investigators didn’t already know — findings that they hadn’t even considered. That was going to take data. Lots and lots of data.

We started importing the obvious: corporate registries. We’d never worked with such large data sets and crashed the system immediately, but it was worth the pain of sorting out our mistakes. The networks we had painstakingly added to the database manually now simply existed within it, ready to be explored. In less than a year, our database had gone from hundreds of nodes to data sets featuring millions of nodes.

Next, we began combining data sets, using POSSIBLE_MATCH edges to ease navigation. We wanted investigators to learn as much about a person or company as they could within the platform. We added lists of politicians, sanctioned actors, and military personnel. We layered in social media data that provided work experience and contact details. We used the custom actions feature to enable investigators to search names in other databases, such as OpenCorporates, in an adjoining tab. We couldn’t fit everything into our database, but we could make it accessible with the click of a button.
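As a rough illustration of what a POSSIBLE_MATCH edge represents, the Python sketch below pairs records from two data sets whose names are equal after normalization. The matching rule here (lowercase, strip accents and punctuation, ignore token order) is an assumption chosen for simplicity, not the rule our data team uses, and the records and ids are invented.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical records from two data sets; ids and names are invented.
REGISTRY = [
    {"id": "reg-1", "name": "DOE, Jane A."},
    {"id": "reg-2", "name": "John Roe"},
]
SANCTIONS = [
    {"id": "san-1", "name": "Jane A. Doe"},
    {"id": "san-2", "name": "Jon Roe"},
]

def normalize(name):
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    tokens = re.findall(r"[a-z0-9]+", ascii_name.lower())
    return " ".join(sorted(tokens))

def possible_matches(left, right):
    """Pair records across two data sets whose normalized names are equal."""
    by_key = defaultdict(list)
    for record in right:
        by_key[normalize(record["name"])].append(record["id"])
    edges = []
    for record in left:
        for other_id in by_key.get(normalize(record["name"]), []):
            edges.append((record["id"], "POSSIBLE_MATCH", other_id))
    return edges

matches = possible_matches(REGISTRY, SANCTIONS)
print(matches)
```

An exact comparison like this misses spelling variants such as "John" and "Jon", and it can also link two different people who share a name. That is why the edge says "possible": it points an investigator to a lead to verify, not to a confirmed identity.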

Today, our Linkurious database is not only a knowledge base storing networks exposed in over 50 Sentry reports, but also an investigative playground for finding new connections, discovering new insights, and analyzing data in new ways. Over 150 million nodes later, who knows what we will find next. As more journalism outfits and NGOs turn to graph databases, collaboration within the platform becomes easier, not only within our own organization, but with our partners, as well. The journey was long, but the results are magic.

The Sentry is an investigative and policy team that follows the dirty money connected to African war criminals and transnational war profiteers. TheSentry.org