Investigative Phishing

How to automate the process of investigative reporting and identify wrongdoing by all the “Big Fish”


Sometimes you need about 5 seconds to find a good investigative story. It could be even faster; the time depends on the speed of your internet connection.

I am serious.

If you don’t believe me, go online to the property register of New York, type in the name of a Cypriot company called Prevezon, and you’ll get 38 results. These are the documents listing luxurious apartments in Manhattan purchased by Prevezon and its associated U.S. entities since 2009.

That is exactly what Bill Alpert, my friend and colleague from Barron’s magazine, did back in 2013. His simple click led to a joint investigation by Barron’s in New York, the Moscow-based newspaper Novaya Gazeta, which I work for, and the Organized Crime and Corruption Reporting Project (OCCRP), an Eastern European not-for-profit consortium of investigative reporters.
The investigation showed that Prevezon purchased its apartments in New York in part with money from the notorious theft of $230 million from Russia’s Treasury in 2007. The one uncovered by the Russian lawyer Sergei Magnitsky, who died in prison in 2009.

Photo: Wikimedia Commons

The publication resulted in several criminal investigations into money laundering in Europe and a civil forfeiture case in the United States, which was finally settled in 2017. On the eve of the trial, Prevezon agreed to pay almost $6 million to the U.S. government without admitting guilt.

(Prevezon and its owner Denys Katsyv, a son of a former powerful Russian official, were represented by a lawyer named Natalya Veselnitskaya. A year before the settlement she met in New York with Donald Trump Jr. who had hoped to get from her “compromising information about Hillary Clinton,” as The New York Times reported in 2017. Veselnitskaya in turn hoped to get support from Trump Jr. for her attempts to cancel the Magnitsky Act.)

And that is just one example among hundreds of good investigative stories that could be done by anyone who has an internet connection.
Instead of typing Prevezon, you could type “Donald Trump” and get the list of property deals in New York executed by the president and members of his family.

Or you could go to hundreds of other similar online databases, such as OpenCorporates, the collection of company registers from all over the world. And by simply typing “Donald Trump,” you would get a long list of companies — from the notorious tax haven of Panama to Washington, D.C. — established by the president and his relatives. 
I hope that you get my point now. There are hundreds of important online databases and they contain thousands of important names that we can check.

This part of investigative reporting — going from one online tool to another and typing name after name — has always been messy. Investigative reporters call this process “phishing.” 
Phishing is great. And the Prevezon example shows that it can lead to stories with real impact. There is just one small problem: Sometimes it takes a long time to catch a “Big Fish” in this ocean of records. You have to waste your time typing name after name into database after database.

This is just one of the problems I hope to solve during my JSK Fellowship.

Problems We Need to Solve

Wasting time on repetitive operations is not the only problem in investigative reporting.

One of the most common claims I hear from people I write about — whether they are organized crime leaders or high-ranking officials — is: “Why do you write about me and not that guy? Your article was ordered by my competitors.”
And the only honest answer to this claim is: “I write about you, because today I caught you and not that guy.”

Investigative reporters are limited in time and resources and they can’t find all wrongdoings by all criminals and politicians in their countries. 
But is there a way to at least increase the number of wrongdoings we identify and cover?

And last, but not least, there is this problem: Investigative reporters often don’t manage to follow up on their stories.
Let’s get back to the example of Donald Trump and his companies all over the world. Say, today you decided to write a story about Panamanian entities that belong to the president’s son, Donald Trump Jr.

Photo: Gage Skidmore

But what if a month after you published your story Donald Trump Jr. decided to sell his companies? And what if the name of the buyer is important? How would you spot this story?

The majority of these databases don’t provide notifications, as Google does, for instance. To spot this kind of story you’ll need to go back to the Panamanian company register and type in the name of Donald Trump Jr. again. But you can’t do it every day with every name you are interested in and in every online database in the world, right?
Is there a way to constantly track changes involving our “people of interest” and the objects connected to them (companies or properties) in real time?

A Possible Solution

I believe there is only one solution to these problems. We need to automate part of the process of investigative reporting, with a tool that would automatically execute some basic tasks. We need computers to free up time for journalists to do groundwork — talk to people, develop sources, find whistleblowers and get secret records from them. 
Such a tool would not only save the time of reporters, but it would also decrease the influence of luck on our stories. Computers can easily check for thousands of names in hundreds of databases, without missing anyone.
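
To make "thousands of names in hundreds of databases" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a working tool: the two "databases" are hypothetical in-memory stand-ins for real registers, and every name and record in them is a placeholder.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real online registers. Each "database" is a
# function that takes a name and returns the matching records.
def search_property_register(name: str) -> List[str]:
    records = {"Example Holdings": ["Deed 001", "Deed 002"]}  # placeholder data
    return records.get(name, [])

def search_company_register(name: str) -> List[str]:
    records = {
        "Example Holdings": ["Example Holdings Ltd"],
        "Person A": ["Alpha Ltd"],
    }  # placeholder data
    return records.get(name, [])

DATABASES: Dict[str, Callable[[str], List[str]]] = {
    "property": search_property_register,
    "companies": search_company_register,
}

def check_all(
    names: List[str],
    databases: Dict[str, Callable[[str], List[str]]],
) -> Dict[Tuple[str, str], List[str]]:
    """Look up every name in every database and keep only the hits."""
    hits: Dict[Tuple[str, str], List[str]] = {}
    for name in names:
        for db_name, search in databases.items():
            results = search(name)
            if results:
                hits[(name, db_name)] = results
    return hits

if __name__ == "__main__":
    found = check_all(["Example Holdings", "Person A", "Person Z"], DATABASES)
    for (name, db_name), results in found.items():
        print(f"{name} in {db_name}: {results}")
```

The point of the sketch is that the loop never gets tired: adding a name or a database means adding one item to a list, not one more evening of typing.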

This tool would also solve the problem of following up on our stories after publication, by tracking changes in databases and notifying investigative reporters. So, if Donald Trump Jr. decided, for instance, to sell his Panamanian companies, you wouldn’t miss this story.
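
One simple way such tracking could work is to save a snapshot of a register on each visit and compare it with the previous one. The sketch below assumes a snapshot is just a dictionary mapping a company name to its recorded owner; the function name `diff_snapshots` and all the data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, Optional[str], Optional[str]]

def diff_snapshots(old: Dict[str, str], new: Dict[str, str]) -> List[Change]:
    """Return (company, owner_before, owner_after) for every record that changed.

    A value of None means the company was missing from that snapshot.
    """
    changes: List[Change] = []
    for company in sorted(set(old) | set(new)):
        before = old.get(company)
        after = new.get(company)
        if before != after:
            changes.append((company, before, after))
    return changes

if __name__ == "__main__":
    # Placeholder data: the same register fetched a month apart.
    last_month = {"Example Co": "Person A", "Stable Co": "Person C"}
    this_month = {"Example Co": "Person B", "Stable Co": "Person C", "New Co": "Person A"}
    for company, before, after in diff_snapshots(last_month, this_month):
        print(f"{company}: {before} -> {after}")
```

A real tool would run this comparison on a schedule and send the reporter a notification whenever the list of changes is not empty.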
We need to teach this tool some basic algorithms, or some basic steps of investigative reporting. For instance:

Step 1: Go check what companies Donald Trump owns in the world.
Step 2: If there are Trump companies with other shareholders, go check to see if the names of shareholders appear in criminal records or sanctions lists. 
Step 3: If there is a company, for instance, in Panama, go check if it owns any subsidiaries in the United States.
Step 4: If there are U.S. subsidiaries, go check if they get any money or subsidies from the government. 
Step 5: Go check if these subsidiaries export/import any goods.
Step 6: Go and check who owns the trading partners of these subsidiaries, and check if their names appear in criminal records or sanctions lists.

And so on…
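
The first three steps above can be sketched in Python as follows. This is only an illustration under loud assumptions: the register and the watchlist are small in-memory placeholders rather than real company registers or sanctions lists, and the `Company` class and the `investigate` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class Company:
    name: str
    country: str
    shareholders: List[str]
    subsidiaries: List[str] = field(default_factory=list)

# Placeholder register, keyed by company name.
REGISTER: Dict[str, Company] = {
    "Alpha Ltd": Company("Alpha Ltd", "PA", ["Person A", "Person B"], ["Alpha US LLC"]),
    "Alpha US LLC": Company("Alpha US LLC", "US", ["Alpha Ltd"]),
    "Beta Ltd": Company("Beta Ltd", "CY", ["Person C"]),
}

# Placeholder for criminal records or sanctions lists.
WATCHLIST: Set[str] = {"Person B"}

def investigate(person: str, register: Dict[str, Company], watchlist: Set[str]) -> List[str]:
    findings: List[str] = []
    # Step 1: find the companies this person owns.
    owned = [c for c in register.values() if person in c.shareholders]
    for company in owned:
        # Step 2: check the other shareholders against the watchlist.
        for holder in company.shareholders:
            if holder != person and holder in watchlist:
                findings.append(
                    f"{holder} (co-shareholder of {company.name}) is on a watchlist"
                )
        # Step 3: check whether the company owns subsidiaries in the United States.
        for sub_name in company.subsidiaries:
            sub = register.get(sub_name)
            if sub is not None and sub.country == "US":
                findings.append(f"{company.name} owns US subsidiary {sub.name}")
    return findings

if __name__ == "__main__":
    for line in investigate("Person A", REGISTER, WATCHLIST):
        print(line)
```

Steps 4 to 6 would follow the same pattern: each one takes the output of the previous step and looks it up in one more data source.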

There could be dozens of similar algorithms that take steps that every investigative reporter repeatedly follows while working on a story.

I came to Stanford three months ago intent on building this tool, without any previous coding experience. Since then, I’ve taken computer science classes that helped me realize that the plan was too naive and self-confident. Iterations of nested “for-loops” are still magic to me. So as you can guess, the tool is still a dream at this stage.

So, the idea of this post is to find allies. If you also believe that investigative reporting is important, if you also think that the job of journalists is to hold the powerful accountable, or if you have coding experience, feel bored with your current job and are looking for a hobby, write me an email with your feedback and thoughts about the tool I’ve described. (“Your tool is a stupid idea” is also a valuable opinion, if you specify why it is stupid.)

My email: