14 New Investigative Journalism Prototypes

What happened at the SZ Editors Lab? Teams from major European media outlets built prototypes to help journalists tackle investigations. Oh, and Edward Snowden made an appearance. Check him out here and check out the projects below.

Sarah Toporoff
Editors Lab Impact
9 min read · Nov 3, 2016

Facebook Monitor

WINNER: The team from Der Standard and derstandard.at describes their prototype, which unearths deleted comments: “Facebook admins routinely delete comments on their pages if they don’t meet their standards, or to silence critical voices. Facebook Monitor makes visible which comments have been unpublished. Our script creates a database of deleted comments. Each deleted comment also includes information about the user, the timestamp, the content of the deleted comment and a like count. Based on the like count we can also see how much attention a post or comment received before it was deleted.”
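The description does not say how the script detects a deletion. A minimal sketch, assuming it compares two snapshots of a page's comments taken at different times; the `Comment` fields mirror the ones listed above, while the snapshot approach, the function name and the sample data are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    comment_id: str
    user: str
    timestamp: str
    content: str
    like_count: int


def find_deleted(previous, current):
    """Return comments present in the earlier snapshot but missing from the later one."""
    current_ids = {c.comment_id for c in current}
    return [c for c in previous if c.comment_id not in current_ids]


# Hypothetical snapshots of the same post, taken at two different times.
earlier = [
    Comment("1", "alice", "2016-10-28T10:00", "Great post", 3),
    Comment("2", "bob", "2016-10-28T10:05", "I disagree", 12),
]
later = [earlier[0]]

deleted = find_deleted(earlier, later)
for c in deleted:
    print(c.user, c.timestamp, c.content, c.like_count)
```

Keeping the last known like count with each record is what would let a reporter judge how much attention a comment had drawn before it disappeared.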

Bot-OX

SPECIAL MENTION: The team from ORF addressed crowdsourced investigation with a bot: “Bot-OX is a framework that allows ORF journalists to quickly generate Facebook Messenger chat bots that can be used to collect all kinds of data or information from our users. With Bot-OX we hope to bring crowdsourced investigations to a new level. Bot-OX helps to overcome two challenges in crowdsourced investigations: finding the right people who are able to contribute valuable data or information, and motivating people to actually contribute to the ‘wisdom of the crowds’ with their knowledge.” Try it out.

Follow the Money

Der Spiegel and Spiegel Online wanted to isolate money-related data in large sets: “Follow the Money identifies and visualises amounts of money in large, unstructured document collections. It searches for specific amounts of money in documents (like “more than X million dollars”) and provides a visual interface using a network graph to get an understanding of money flows within datasets. It’s an open-source solution, ready to be integrated within the forensic search tool Hoover.”
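As a rough illustration of the amount-spotting step, here is a minimal sketch using a regular expression. It is not the team's implementation: the pattern, the list of scale words and currencies, and the function name are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a number, an optional scale word, then a currency.
MONEY_PATTERN = re.compile(
    r"(?P<amount>\d+(?:\.\d+)?)\s*"
    r"(?P<scale>thousand|million|billion)?\s*"
    r"(?P<currency>dollars|euros|USD|EUR)",
    re.IGNORECASE,
)

SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}


def extract_amounts(text):
    """Return (value, currency) pairs for every money mention found in text."""
    results = []
    for match in MONEY_PATTERN.finditer(text):
        value = float(match.group("amount"))
        scale = match.group("scale")
        if scale:
            value *= SCALES[scale.lower()]
        results.append((value, match.group("currency").lower()))
    return results


print(extract_amounts("The firm moved more than 3 million dollars offshore."))
```

A real document collection would need far more patterns (currency symbols, thousands separators, other languages), which is presumably where most of the work lies.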

Unitj — Make Google Sheets journalism-compatible

The SRF data team harmonises data structures across sets with Unitj: “With our add-on to Google Sheets, we make Google Sheets journalism-compatible. With our tool, journalists are able to import and integrate their collected source sheets into one consistent master sheet that greatly helps them in cleaning their data and enforcing a certain data model. With a raw dataset that already has a consistent structure, it will be far easier later on to do further (automated) preprocessing, e.g. de-duplicating companies’ names, georeferencing addresses, etc.” See gif demo.

Policy Papertrails

The Neue Zürcher Zeitung team wanted to make parliamentary documents more searchable: “Policy Papertrails is a newsroom tool for researching the policy making process in the Swiss national parliament. It’s a searchable database of all public records from the Swiss parliament, as provided by parlament.ch. Journalists can use the database for accessing background information or for generating leads to stories. The public can use the database to get a better understanding of how policy making processes work and how issues evolve over time.” Try it out.

Local Data Hub Munich

The merkur.de and tz.de team describes their Munich-focused prototype: “As the newsroom for two regional news sites, we collect data from our audience as user-generated content. We calculated, for example, the average rent for each part of the city from the data users gave us voluntarily. In addition, there is lots of open data regarding the city of Munich provided by the city of Munich itself. Now we have created a platform which combines both: our data about the city and open data. The hub is open to audiences and journalists alike. By combining two sets of data, one can create new knowledge about Munich. For example: If you combine the average rent for each part of the city with crime statistics, you can learn more about the impact of crime on rent.” Try it out.

PEPsickle

OCCRP’s team tames the wilds of Wikipedia’s multiple language entries with PEPsickle: “If you’ve just received a new leak and find yourself investigating a story in a country you know nothing about, where do you usually start? For most of us, it’s Google or Wikipedia. But what if you had a tool that allowed you to leave your wiki adventures for those lonely Friday nights when there’s nothing to watch on Netflix, and let you immediately search for Persons of Interest in that country, giving you a list of not only politicians, business people and other public figures, but also their families and associates, AND it did it in multiple languages and alphabets?”

Measuring air quality in Germany’s dirtiest town

Stuttgarter Zeitung and Stuttgarter Nachrichten address public health: “Air quality and particularly Feinstaub (fine particles) is a huge problem in Stuttgart, Germany. For years, the values have been exceeding maximum levels set by the EU. Excessive exposure to Feinstaub is proven to present severe health risks, including but not restricted to lung and heart problems. Our map, of which we present here a mockup and a functional prototype, visualises the live and historic data of the sensors. As soon as all sensors are in place, there will be extensive reporting in the newspaper and on the website. In cooperation with the robot journalism company AX Semantics (which is also located in Stuttgart, Germany), there will be automated local air quality reports and individualised newsletters for any reader who is interested.” Try it out.

The Conflict Of Interest Resolver

Non-profit investigative newsroom CORRECTIV made potential conflicts of interest more accessible: “The conflict resolver is a tool that scrapes scientific journals and extracts the statements on conflicts of interest. It can be used primarily by journalists to evaluate experts’ credibility and possible bias. It can also be used by the public as a browser plug-in, when I am reading a story with an expert or if I want to check my doctor’s conflicts of interest.” Try it out.

State-O

The Bayerischer Rundfunk team created a tool to search state-owned companies: “State-O aims to shed light on state-owned enterprises and offers information in a simple searchable database. The goal is to build a database of state-owned enterprises all over Europe to be used mainly by investigative journalists. In a first draft, State-O uses information on companies controlled directly or indirectly by the German federal government, published by the Federal Ministry of Finance in PDF format. Direct ownership of companies by the Austrian federal government is also part of the already scraped dataset.”

Hoover

A team of French and Romanian innovators representing the European Investigative Collaborations network built new functionalities for Hoover, a search tool for large collections of documents: “Hoover is an open-source, high-performance indexing and search tool made to browse terabytes of documents of any type (PDFs, pages, text, emails, images, archives, etc.). It aims to help journalists actually find information in huge datasets and collaborate by sharing documents with each other. It has been used by multiple newsrooms working together at an international scale.” Try it out.

RP maps

Rheinische Post created a blind-spot finder for local reporting: “RP maps is an interactive map for the reporters in our newsroom at the Rheinische Post. The map displays the number of articles about the city of Düsseldorf, filtered by district. Combined with actual population data (density), RP maps delivers intuitive insights to the editor about which districts might be underrepresented in terms of the amount of reporting we do on them. The goal is to point out blind spots in our daily business, encourage editors to tackle and fill those blind spots by adjusting their investigative focus, and in the end enhance publication quality for all readers by increasing relevance.”
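The core comparison can be sketched in a few lines. This is an illustration only: it assumes coverage is normalised as articles per 1,000 residents (the team mentions population density, so their metric may differ), and the district labels and figures below are invented placeholders, not Düsseldorf data.

```python
def coverage_per_1000(article_counts, population):
    """Articles per 1,000 residents for every district with population data."""
    return {
        district: 1000 * article_counts.get(district, 0) / residents
        for district, residents in population.items()
    }


# Placeholder inputs: article counts per district and resident numbers.
articles = {"District A": 50, "District B": 5}
residents = {"District A": 10_000, "District B": 20_000, "District C": 5_000}

coverage = coverage_per_1000(articles, residents)

# Least-covered districts first: these are the candidate blind spots.
blind_spots = sorted(coverage, key=coverage.get)
print(blind_spots)
```

Districts with no articles at all still appear in the result, which matters here because the least-covered areas are exactly the ones the tool is meant to surface.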

Better Polls Visualization

Host team Süddeutsche Zeitung worked on a better way to deal with opinion polls: “In autumn 2017 the next general election will be held. In the months to come, opinion polls will play an even more important role in reporting about German politics. Traditionally, media outlets report on a new poll in the following style: If the election were held today, party x would get y per cent of the votes. This is a decline of z per cent compared to the previous week.”

But by reporting this way, they can mislead their readers into taking specific polling figures as facts. They fail to communicate the uncertainty that exists in any poll, and more specifically the fact that polling figures are not absolute but just a mean inside a confidence interval.
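To make that uncertainty concrete, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It assumes a simple random sample and a 95 per cent confidence level (z = 1.96), which real polls only approximate; the 30 per cent share and the sample size of 1,000 are illustrative numbers, not figures from any actual poll.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Half-width of the confidence interval for a polled share (0 to 1)."""
    return z * math.sqrt(share * (1 - share) / sample_size)


share = 0.30          # illustrative: a party polling at 30 per cent
sample_size = 1000    # illustrative: respondents in the poll

moe = margin_of_error(share, sample_size)
low, high = share - moe, share + moe
print(f"{share:.0%} means roughly {low:.1%} to {high:.1%}")
```

Under these assumptions the interval is several points wide, so a week-on-week change of one point sits well inside the noise.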

According to the Süddeutsche Zeitung team, the best way to communicate around polls is not to report on them individually but to aggregate them. “On an aggregate level (polls) provide valid information about potential voting patterns of the electorate. Therefore a smarter way of reporting about opinion polls is to get as much data as possible. The most comprehensive overview of German opinion polls can be found on Wahlrecht.de, a website maintained by volunteers. We created the R package germanpolls for scraping this data. In order to offer a single value we compute a rolling average with a lag of 10. This is an easy implementation that can be extended, especially by weighting the individual polls.”
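The team's own code is the R package germanpolls; the Python sketch below only illustrates the aggregation idea. It assumes “a lag of 10” means averaging over the 10 most recent polls, and the function name and sample figures are hypothetical.

```python
def rolling_mean(values, window=10):
    """Average each value with up to window - 1 of the values before it."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Illustrative poll results for one party, oldest first (per cent).
polls = [30, 32, 34, 36]

# A short window keeps the example easy to check by hand.
print(rolling_mean(polls, window=3))
```

Weighting individual polls, as the team suggests, would mean replacing the plain mean with a weighted one, for example by sample size or recency.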

SimpLux

Le Monde built a tool to easily digest Luxembourg’s corporate registry data to look for shell companies: “During the Panama Papers investigation, we often had to search for corporations and companies related to Luxembourg, the closest European country to being a tax haven. Luxembourg has a corporate registry, but it is quite difficult to use, notably because you can’t search by a shareholder or beneficial owner name, only by corporation name. Our project is to scrape the whole Luxembourg corporate registry and get a local copy; to build a tool to explore this data; and to enable advanced functions, like automated named entity detection, batch search, graph navigation, alerts, etc. As a user, I want to be able to search through the data, from a corporation name as well as from a person’s name. I want to be able to run batch searches, and to obtain comprehensive and ordered results in order to help my investigations. With a “résidant à Paris” (“living in Paris”) or “de nationalité française” (“French nationality”) search, I can find every document in the Luxembourg corporate registry that mentions someone living in Paris. I can also run a batch search with a CSV list of names or companies.”

Some project descriptions have been lightly edited for length and clarity. See all original descriptions in the GEN Community.
