Search: “The old Google-like list of results is definitely not the future”
In the lead-up to the Süddeutsche Zeitung Editors Lab in Munich, which will focus on investigative journalism, I had a talk with Sébastien Heymann, CEO of Linkurious, a tool designed to help the user search and visualise data.
Bertrand: At which stage of the Panama Papers inquiry did you get involved? What was the role of the data visualisation software you provided to ICIJ?
Sébastien: We have been in touch with the ICIJ since the Swiss Leaks investigation in 2015. When the Panama Papers investigation started, the ICIJ reached out and asked to access our Linkurious Enterprise software. The 370 journalists ICIJ gathered for the investigations were able to use Linkurious to search the Panama Papers.
Linkurious Enterprise helped identify high profile individuals and the accounts, companies and middle-men they were linked to.
How did the less tech-savvy journalists uncover the hidden insights among the 11 million documents and the 214,000 organisations mentioned in the leaked documents?
Linkurious Enterprise is very intuitive to use. Our goal is to empower non-technical people and help them find insights hidden in large, connected, and complex datasets.
In the case of the Panama Papers, the journalists could simply use our search engine to look for relevant information: a name, an address, a company. Linkurious Enterprise then displayed the results and allowed the journalists to deep dive into the relationships between these entities.
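As a rough illustration of the "search, then explore relationships" workflow described above, here is a minimal Python sketch. It is not Linkurious Enterprise's implementation: the entity names, relationship labels, and the functions `build_graph`, `search` and `neighbours` are all hypothetical, made up for this example.

```python
from collections import defaultdict

# Hypothetical toy data: (entity, relationship, entity) triples.
EDGES = [
    ("Alice Example", "shareholder_of", "Acme Holdings Ltd"),
    ("Acme Holdings Ltd", "registered_at", "1 Harbour Road"),
    ("Bob Sample", "intermediary_for", "Acme Holdings Ltd"),
]


def build_graph(edges):
    """Store each relationship in both directions so it can be explored from either end."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
        graph[target].append((relation, source))
    return graph


def search(graph, query):
    """Return every entity whose name contains the query (case-insensitive)."""
    q = query.lower()
    return sorted(node for node in graph if q in node.lower())


def neighbours(graph, node):
    """Return the (relationship, entity) pairs directly linked to a node."""
    return sorted(graph.get(node, []))


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for hit in search(graph, "acme"):
        print(hit, "->", neighbours(graph, hit))
```

A journalist-facing tool would draw these links as a network rather than print them, but the underlying idea is the same: a text search finds a starting entity, and its neighbours are the next leads to follow.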
Do you see data visualisations as more of an internal tool to help journalists digest large quantities of information or as an audience-facing end product? In your mind, can a relevant dataviz replace a classic article?
I think dataviz works in both contexts. Journalists are getting more and more mature in the way they use various data analysis and visualisation tools. Sometimes this is something that happens in the background, without the public knowing about the technical details behind the articles they are reading.
Big data techniques can speed up investigations, reduce costs, and enable efficient collaboration among journalists.
How it works is simple:
- Machines read lots of documents to extract data
- Humans find clues within data through the lens of dataviz
- Humans read related documents.
The Panama Papers investigation is a good example, as it would not have been possible without the use of technology from the start. We believe that it is a glimpse of the future of investigative journalism.
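The three steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the investigation: the documents, the list of known entity names, and the functions `extract_entities` and `co_occurrences` are hypothetical. Real projects rely on far more capable extraction tools than simple name matching.

```python
from itertools import combinations

# Hypothetical toy documents standing in for a leaked archive.
DOCUMENTS = {
    "doc-1": "Alice Example is a director of Acme Holdings Ltd.",
    "doc-2": "Acme Holdings Ltd paid Bob Sample a fee.",
    "doc-3": "Bob Sample met Alice Example.",
}

# Hypothetical list of names to look for (an assumption of this sketch).
KNOWN_ENTITIES = ["Alice Example", "Bob Sample", "Acme Holdings Ltd"]


def extract_entities(text):
    """Step 1: the machine reads a document and extracts the entities it mentions."""
    return {name for name in KNOWN_ENTITIES if name in text}


def co_occurrences(documents):
    """Link entities mentioned in the same document, remembering which document."""
    links = {}
    for doc_id, text in documents.items():
        for pair in combinations(sorted(extract_entities(text)), 2):
            links.setdefault(pair, []).append(doc_id)
    return links


if __name__ == "__main__":
    # Step 2: a human scans the links (a real tool would draw them as a graph).
    # Step 3: the stored document IDs say which source documents to read next.
    for (a, b), doc_ids in co_occurrences(DOCUMENTS).items():
        print(f"{a} <-> {b}: see {', '.join(doc_ids)}")
```

Keeping the document IDs on every link is the design choice that matters here: the visualisation points to a clue, and the journalist can go straight back to the source material to verify it.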
What’s also exciting is that we are seeing more and more news organisations use visualisations to communicate the results of their investigations, such as the Offshore Leaks Database. This is important because it helps us understand the complexity of our world. Dataviz can illustrate articles and provide extended information on demand; however, visualisation is never self-explanatory; it won’t replace the need for writing stories and giving curated insights.
What is the difference between a data visualisation and a sophisticated infographic? How can media organisations take advantage of these tools?
I consider that data visualisation is a tool to produce new information, while infographics are a way to present data for consuming information. They both use visual representations to convey meaning and both can be interactive. The reader is actively searching for information in data visualisations, but he or she is passively reading curated insights from infographics.
Media organisations should therefore use data visualisation to find new stories and prepare articles, then publish carefully crafted infographics unless their audience is eager for deep analysis.
What are the improvements to Linkurious since the beginning of 2016, especially regarding collaboration between hundreds of journalists? What lessons did you learn from the Panama Papers?
Working with the ICIJ has been a tremendous help for us. We have been able to use their feedback to make Linkurious Enterprise even better. In the last few months we have, for example, improved the way users share visualisations. More recently, we have introduced the ability to visualise connected data on geographic maps and have improved our search feature for non-English languages such as Arabic.
The Panama Papers also gave us incredible exposure, which helped attract anti-financial crime teams around the world. Now we are also working with tax authorities, insurance companies, banks and money transfer companies.
How secure were the exchanges between data journalists during the Panama Papers inquiry?
Security was paramount in the investigation as the data was extremely sensitive and could be of interest to state-level actors. The biggest challenge was the number of people involved in the investigation and how distributed they were. You had hundreds of journalists around the world, including in high-risk areas. Without giving away too many details, we can say that Linkurious Enterprise helped by providing secure access to the data and encrypting the communications. Even with that, it’s incredible that the story did not leak until the official publication!
We are now working to improve our security further by developing a fine-grained access rights management system.
Can you change the way people do search queries? Do you imagine a visual search engine? And are you in contact with Google or other search engines to change a list of links into a smart dataviz?
One of the big trends in search is mobile. Companies like Google, Apple or Facebook are trying to find new ways to surface information. The old Google-like list of results to a query typed by the user is definitely not the future here. The goal is to provide a single answer, ideally without the question even being asked.
At the same time, there are cases where it is important to provide people with options to analyse data and make smart decisions. I think visualisation will play a greater role here in the future. Look, for example, at the way Airbnb provides various ways to analyse data to its users, combining a list of results with data visualisations, such as histograms of price distribution and geospatial maps.
With Linkurious Enterprise we provide a visual search engine for any connected data. For instance, NASA uses it to find experts and important documents in databases by connecting authors, topics, location, and sources.
As a startup, you are looking for funding. How much are you asking for and what will the money go towards? What will Linkurious look like in 2020?
Right now we are expanding the company, adding more people as our number of customers grows quickly. We are looking to raise money to accelerate our growth and perhaps open a US office. [Linkurious is based in France]
In 2020 we would like to see our technology in the hands of everyone and used to help with today’s most pressing data challenges: cyber-security, financial crime, medical research, and investigative journalism.
Sébastien is the CEO of Linkurious. He has a long-standing passion for democratizing network thinking. In 2008 he co-founded Gephi, the leading open source software for exploring networks of all kinds. Since then, he has published, written and talked extensively about the impact of graph visualisation.
Peter Aldhous — BuzzFeed
“I don’t see data journalism and other forms of journalism being separate. I see them being totally intertwined.” (Euroscientist, 30 June 2016)
Sarah Cohen — The New York Times
“Stories about bedbugs that are called investigative journalism are kind of silly. But not everything has to be a six-month project. The core of investigative work is something of public importance that somebody doesn’t want you to know.” (The Washington Post, 7 August 2016)
Friedrich Lindenberg — OCCRP
“Often as a journalist, you want to find out ‘Where can I find information about this person or this company?’ What you want then is a place where you can search as many data sources as possible. That’s why we’re bringing together a lot of government data, corporate records and other kinds of information from previous investigations that we have exclusive access to, and all of that is searchable.” (IJNet, 17 October 2016)