The WSJ on DeepFakes: ‘It’s a cat & mouse game’

Francesco Marconi, The Wall Street Journal’s R&D Chief, shares key learnings from the WSJ Media Forensics Committee.

Ana Lomtadze
Oct 10, 2019 · 11 min read

A year ago, The Wall Street Journal created a Media Forensics Committee consisting of 21 people with the goal of developing resources on deepfakes and providing direct verification assistance to reporters. GEN spoke with Francesco Marconi, WSJ’s Research & Development Chief, to learn more about the inner workings of the committee and the lessons it has learned. While awareness in newsrooms is crucial, Marconi admits that deepfake detection is like a ‘cat & mouse game’. That said, he shares his thoughts on how newsrooms, including smaller ones, can tackle the issue and what the role of tech companies should be. Importantly, he explains the particularities of deepfakes for different media types and the risk of ‘real-time deepfakes’.

GEN: Since fall 2018, The Wall Street Journal has had a committee of 21 people serving as an in-house verifier of and resource on deepfakes. What are the backgrounds of these committee members and how exactly is their work integrated within the newsroom?

Francesco Marconi: The WSJ Media Forensics Committee consists of more than 20 reporters and editors from different parts of the newsroom (editorial, photo, video, product, R&D, audience & analytics, standards & ethics). Each of them is on-call to answer reporters’ queries about whether a piece of content has been manipulated. After each query from a reporter and subsequent analysis, members write up a report with details of what they learned.

This initiative is actively training journalists by showing them examples of altered videos, photos and audio files and teaching them deepfake detection methods. This includes going through research techniques that allow them to trace the origin of footage, and learning video manipulation skills that help them uncover tell-tale signs that a video has been altered with the help of AI.

This is an ongoing effort as we are regularly looking for new media forensics research as well as tools — and updating our training materials to make sure that our detection methods are state of the art.

What are the three major lessons learned since the creation of the WSJ Media Forensics committee a year ago?

  1. General awareness in the newsroom about the issue of deepfakes is crucial (we regularly host training sessions and seminars with guest speakers)
  2. It’s a cat & mouse game: even if we develop new verification approaches, new technology will always find a way to make detection more challenging. In that regard, it’s important to keep up with the evolution of misinformation and anticipate what might come next.
  3. Journalism processes and standards don’t change: although we are talking about advanced technology, the foundation of what we do is still checking the origin and trustworthiness of the source, doing background research, comparing information, etc.

How does the committee categorise deepfakes? What kinds of technology are you most worried about and why?

Content generated with the help of AI can be broadly called Synthetic Media. When that process is used to deceive audiences, we call these synthesized pieces of content deepfakes. Deepfakes can be segmented into fake video, images, audio and text.

While most of the conversation around deepfakes is focusing on video right now, we have also seen audio deepfakes improve rapidly. The first examples of AI-generated audio that we heard sounded very robotic and were easy to identify as fakes. One of the latest examples, a virtual copy of the voice of podcast host Joe Rogan, sounds eerily life-like. It is very difficult to tell it apart from Rogan’s real voice. This technology could be used for fraud. In August, the WSJ reported the case of criminals who used AI-based software to mimic a chief executive’s voice demanding a fraudulent transfer of €220,000.

Over the past few months, there have been several examples of a different form of doctored video, so-called ‘cheapfakes’ or ‘shallowfakes’. These are videos that have not been altered with the help of artificial intelligence, but with rudimentary video editing tools. A good example of that is a video of House Speaker Nancy Pelosi that appeared on social media in May. Forgers slowed down footage of Pelosi speaking at a conference and readjusted the pitch of her voice. This made it appear as though she was intoxicated.

Even though it was relatively easy to spot that this was doctored footage, the video spread widely on social media. This shows that it does not take much to fool some online users. Those are the effects of something that experts call ‘confirmation bias’: When people see a video that seems to prove something that they already believe, they are more likely to think that the footage is real and to share it online.

This infamous video of Nancy Pelosi on Facebook spurred the same old question about the balance between freedom of speech and the fight against misinformation. How could the fact that deepfakes are based on AI and machine learning reframe the question of freedom of speech?

As mentioned, the Pelosi video is not a deepfake, since it doesn’t show her saying or doing anything she didn’t actually say or do, and it wasn’t created using deep learning (which is where the “deep” part of deepfake comes from). It’s a ‘shallowfake’ in which the footage is just slowed down and taken out of context to make it misleading. A deepfake refers to ‘synthetic media’ which is modified through AI, for example with generative adversarial networks.

The second issue is related to whether social networks should take the video down. It’s not as simple as it may seem. Broadly speaking, could content of this nature be considered satire? Where should the line be drawn? We have seen that platforms like Facebook, YouTube and Twitter have found different answers to these questions so far: In the case of the Pelosi video, Facebook decided to leave it on its website, downrank it and add a note for viewers that it has been debunked. YouTube decided to pull it. Twitter left it on its platform.

In an interview with Digiday, you spoke of ‘a massive proliferation in technology in a couple of years which will coincide with the U.S. general election’. Do you think the US media is ready to take on this challenge? What can medium- and small-sized newsrooms do to be better prepared?

Media forensics experts predicted that the 2018 midterm elections would be the first major political event in the U.S. that would entail the spread of deepfakes on social media. That was not the case. But we have to stay vigilant. Our goal is to be prepared when deepfakes become prevalent. The Wall Street Journal is one of the first newsrooms to tackle the looming threat that deepfakes pose, but we are seeing heightened awareness at other news organizations, too. Reuters, for example, is training its reporters in deepfake detection by showing them examples of synthetic videos that the news agency generated in collaboration with a startup. The NYT launched the provenance project and the Washington Post developed a guide to manipulated video.

However, there’s still a massive knowledge gap between large newsrooms like WSJ, NYT or WaPo and small news organizations when it comes to deepfakes. One of the solutions is collaboration and having the news industry come together as one to address this challenge. Our team at the Journal has published a deepfake guide on Nieman Lab. It is important to us to advance our own understanding of the issue, but also to share best practices with the rest of the news industry.

What are your thoughts on media forensics training, an understanding of the underlying technologies behind the various tools used and the ability to detect hidden or deleted data? Do you think it should be part of all journalists’ education? How can the curricula keep up with the fast developing technology?

At the moment, it is still relatively easy to spot a deepfake video, if you know what to look for. Basic video manipulation and digital research skills are most likely enough to recognize most altered videos. However, we have to keep up with the advancements in deepfake technology and constantly update our detection methods. Some startups and tech companies are already preparing for more intricate deepfakes by developing automatic detection software based on Machine Learning, which will help social networks and newsrooms spot altered videos and photos faster and more reliably. Understanding deepfakes should be part of the training program of newsrooms. However, misinformation has always existed and it’s not going away anytime soon.

Some universities are also being proactive in educating their students, and treat this as a broader media literacy issue. For example, the Missouri School of Journalism is running a student competition to develop deepfake detection tools while New York University has a media literacy class called “Faking the News”, where students learn (among other things) how to create deepfakes.

Most of the technology that could aid deepfake verification is still not publicly available or does not fit into the workflow of newsrooms. Besides, the more deepfake detection algorithms are developed, the better deepfake creators will know how to improve the technology. Do you see any upcoming solutions to resolve this issue?

Most of this research is still in its early stages. We can see that many universities, like UC Berkeley or SUNY, are focused on finding a verification method. The Department of Defense is also interested in finding forensics techniques: the Defense Advanced Research Projects Agency (DARPA) has some programs to address media verification. There are also some startups and IT security companies trying to tackle this issue, including Deeptrace, Amber and ZeroFox.

Image Source: DeepTrace

Tech companies are also starting deepfake detection initiatives. Google has shared a dataset of deepfake videos, which allows researchers to test their detection tools. Facebook announced a deepfake detection challenge and will release a similar dataset. The company will fund this effort with $10 million and give out cash prizes to researchers that come up with best detection methods.

Image credit: Google. A sample of videos from Google’s contribution to the FaceForensics benchmark. To generate these, pairs of actors were selected randomly and deep neural networks swapped the face of one actor onto the head of another.

Tech giants are increasingly investing in deepfake detection. Given their tech and financial resources, do you worry about a handful of tech companies having the monopoly over the control of deepfakes?

We are going through a period where the public is becoming less positive about the impact of big tech companies, and naturally journalists may be wary of using deepfake detection tools developed by these giants. But deepfakes seem to be a concern for these companies as well and we see them releasing training datasets that could be helpful for creating a detection tool. This is a great first step. However, I imagine that journalists may be a bit more comfortable using independently developed or open source tools developed internally that leverage these and other data sets.

Tools that were developed by tech companies for innocuous purposes eventually ended up being used for dis/malinformation. How can this risk be mitigated in the future? What is the responsibility of the companies in addressing it?

It’s important for tech companies to consider following ethical & transparency guidelines and put processes in place that attempt to prevent their technology from being used to spread misinformation. One possible solution for companies to mitigate the risk of video and image manipulation tools is to introduce a watermarking or blockchain system that makes it easy to detect if content has been altered using specific software.
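
As a rough illustration of the idea of making alterations detectable, here is a minimal Python sketch. It does not implement watermarking or a blockchain. It uses a simpler stand-in, a keyed hash (HMAC), which only shows whether a file's bytes have changed since a tag was issued. The key, the function names and the sample bytes are all hypothetical and chosen for illustration only.

```python
import hashlib
import hmac

# Hypothetical signing key held by whoever publishes the original file
# (an assumption for this sketch, not a real vendor scheme).
SECRET_KEY = b"example-publisher-key"


def sign_media(data: bytes, key: bytes = SECRET_KEY) -> str:
    """Return a hex tag bound to the exact bytes of the media file."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unaltered(data: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare it to the stored one in constant time."""
    return hmac.compare_digest(sign_media(data, key), tag)


# Placeholder bytes standing in for a real video file.
original = b"bytes-of-the-original-video"
tag = sign_media(original)

print(is_unaltered(original, tag))          # True: bytes match the tag
print(is_unaltered(original + b"x", tag))   # False: one extra byte breaks the match
```

A scheme like this only proves that a file is byte-identical to what was signed. It cannot say what was changed, and it does not survive legitimate re-encoding, which is part of why more robust watermarking approaches are being discussed.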

Deepfakes are not an issue only for video, but also for audio, which is arguably even harder to tackle. Some are now also warning against ‘fake text’. What are your thoughts on the particularities of deepfakes for different media types?

Each media format (video, audio, text, images) has its own particularities, and the detection tips differ for each one. Fake text is very difficult to catch. A well-known algorithm is the GPT-2 model, created by OpenAI. This model was trained to predict text and can also translate between languages and answer questions. Because of the potential misuse of this technology, OpenAI decided to release its model in stages. One potential misuse could be, for example, generating misleading news articles. In fact, research from OpenAI partners found that people find synthetic text almost as convincing as real articles. Here’s a site that lets you test the GPT-2 model: https://talktotransformer.com/ . Using that tool you can give a prompt and the machine comes up with fake text. For example, by giving the prompt “Global Editors Network is…” it generates the following copy:

You give a prompt to the tool and the machine comes up with fake text using the GPT-2 model algorithm.

Another example of how convincing synthetic text can be is This Marketing Blog Does Not Exist, a blog fully generated by AI.
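
To give a sense of what “trained to predict text” means at a very small scale, here is a toy Python sketch. It is not GPT-2, which is a large neural network. It is a simple word-pair (bigram) model that continues a prompt by picking words that followed the previous word in its training text. The corpus, prompt and function names are made up for illustration.

```python
import random
from collections import defaultdict


def train_bigrams(text: str) -> dict:
    """Map each word to the list of words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers: dict, prompt: str, length: int = 8, seed: int = 0) -> str:
    """Extend the prompt one word at a time by sampling an observed follower."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(length):
        options = followers.get(out[-1])
        if not options:  # no known continuation, so stop early
            break
        out.append(rng.choice(options))
    return " ".join(out)


# Tiny made-up training text; a real model learns from far more data.
corpus = "the network is global and the network is growing and the field is changing"
model = train_bigrams(corpus)
print(generate(model, "the network"))
```

Even this crude model produces text that looks locally plausible, because every word pair it emits was seen in training. Large models like GPT-2 apply the same predict-the-next-piece idea with far more context and data, which is what makes their output so convincing.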

What do you think about the attempts to regulate deepfakes, particularly the DEEPFAKES Accountability Act in the US, which would require any deepfake author to disclose that the material has been altered?

At the moment, there’s no specific federal legislation that directly addresses the deepfake issue. In some cases, platforms have taken videos down on grounds of copyright. Existing legal frameworks for harassment, defamation, or rights of publicity may also apply. But there isn’t a specific legal framework that considers deepfakes. Transparency disclosures can be helpful in these situations, which is something that the DEEPFAKES Accountability Act includes.

Some states have passed laws that are aimed at reducing the harm of deepfakes: Distributing deepfake revenge pornography is now a misdemeanor in Virginia, for example. And a new law in Texas criminalizes creating and sharing a deepfake with the intent of influencing the outcome of an election. We’ve also seen state and local governments regulating AI in some way: California has a Bot disclosure law.

Some argue that the proliferation of deepfakes and their byproduct, increasing mistrust of news, can be an opportunity for legacy media. They could increase their audiences as ‘trusted validators and assessors’. What are your thoughts?

From a transparency point of view, in the age of AI, journalists have an increasingly important responsibility to keep algorithms in check. In fact, we published a guide to teach journalists methodologies to conduct algorithm transparency reporting.

In terms of trust, deepfakes make it harder for news organizations to verify third-party materials. We don’t only ask if a video is fake, but also how we can prove that a video is true. This challenge also presents an opportunity for news organizations. When people distrust content published on social networks, the news industry has an obligation to provide credible information they can’t find on those platforms.

Do you foresee the development of deepfakes in real time or live streamed?

We are not there yet, mainly because of limitations to computing power and network bandwidth. However, we have already seen the first examples of real-time deepfakes, similar to Snapchat or Instagram filters, like MIT Technology Review Editor-in-Chief Gideon Lichfield interviewing himself in the role of Vladimir Putin. With quantum computing and 5G proliferation we will undoubtedly get to a point where simulations get very close to reality.

Francesco Marconi is the R&D Chief at the Wall Street Journal and was previously AI co-lead at the Associated Press. Francesco’s forthcoming book “Newsmakers: Artificial Intelligence and the Future of Journalism” will be published by Columbia University Press in January 2020.

Global Editors Network

The Global Editors Network (GEN) was the worldwide association of editors-in-chief founded in 2011. It ceased its activities in November 2019 due to lack of sustainable finances.

Written by Ana Lomtadze

Associate Program Specialist at UNESCO (CI sector). Formerly at Global Editors Network & Open Society Foundations.
