Simulacra at Scale
A Short History of the Audiovisual Fake
By Data & Society Media Manipulation Researcher Britt Paris
In the summer of 2017, videos of Barack Obama giving previously unreleased addresses appeared on YouTube. The catch: These events never actually happened. University researchers had uploaded these videos to demonstrate how neural networks and generative adversarial networks can turn audio or audiovisual clips into realistic, but completely fake, lip-synced videos.
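For readers curious about the mechanics, a generative adversarial network pits two models against each other: a generator produces fakes, and a discriminator learns to distinguish them from real samples, so the generator improves until its fakes pass. Below is a minimal sketch of that training loop in PyTorch, using toy fully connected networks and placeholder dimensions of my own choosing; it is illustrative only, not the researchers’ actual models.

```python
import torch
import torch.nn as nn

latent_dim, sample_dim = 64, 784  # placeholder sizes, not from the research

# Generator: maps random noise to a synthetic sample.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, sample_dim), nn.Tanh(),
)

# Discriminator: scores how "real" a sample looks (1 = real, 0 = fake).
discriminator = nn.Sequential(
    nn.Linear(sample_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch: torch.Tensor) -> None:
    n = real_batch.size(0)
    real_labels, fake_labels = torch.ones(n, 1), torch.zeros(n, 1)
    fake_batch = generator(torch.randn(n, latent_dim))

    # Train the discriminator to separate real samples from generated ones.
    d_loss = bce(discriminator(real_batch), real_labels) + \
             bce(discriminator(fake_batch.detach()), fake_labels)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Train the generator to fool the (just-updated) discriminator.
    g_loss = bce(discriminator(fake_batch), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```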
With consumer-grade apps like FakeApp and FaceSwap, you don’t need to be a computer scientist to create convincing fakes. We’re already seeing misogynist and racist GIFs of Michelle Obama performing a striptease, and an antisemitic video that grafts Adolf Hitler’s face onto that of Argentine President Mauricio Macri during a public speech. These artifacts belong to a rapidly growing genre of videos produced through various modes of artificial intelligence, colloquially referred to as “deepfakes.” Here, I define deepfakes as the use of machine learning algorithms to manipulate audio, photo, and video files to simulate reality.
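FaceSwap-style tools are commonly built around an autoencoder with one shared encoder and two identity-specific decoders: the encoder learns features common to both faces, each decoder learns to reconstruct one person, and the swap routes person A’s encoding through person B’s decoder. Here is a minimal sketch of that architecture, with layer sizes I’ve assumed for illustration rather than taken from any of these apps.

```python
import torch
import torch.nn as nn

class FaceSwapper(nn.Module):
    """Shared encoder, one decoder per identity (illustrative sizes)."""

    def __init__(self, face_dim: int = 64 * 64 * 3, code_dim: int = 512):
        super().__init__()
        # Shared encoder: maps any aligned, flattened face crop to a code.
        self.encoder = nn.Sequential(
            nn.Linear(face_dim, 1024), nn.ReLU(),
            nn.Linear(1024, code_dim), nn.ReLU(),
        )
        # Decoder A trains only on person A's faces; decoder B on person B's.
        self.decoder_a = nn.Sequential(
            nn.Linear(code_dim, 1024), nn.ReLU(),
            nn.Linear(1024, face_dim), nn.Sigmoid(),
        )
        self.decoder_b = nn.Sequential(
            nn.Linear(code_dim, 1024), nn.ReLU(),
            nn.Linear(1024, face_dim), nn.Sigmoid(),
        )

    def reconstruct(self, faces: torch.Tensor, identity: str) -> torch.Tensor:
        # Training objective: each decoder reconstructs its own identity.
        decoder = self.decoder_a if identity == "a" else self.decoder_b
        return decoder(self.encoder(faces))

    def swap_a_to_b(self, faces_a: torch.Tensor) -> torch.Tensor:
        # The swap: person A's pose and expression, rendered as person B.
        return self.decoder_b(self.encoder(faces_a))
```

Because the encoder is shared, it captures pose and expression common to both faces while each decoder learns one person’s appearance; that division of labor is what makes the swap possible.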
While these videos are not yet fully believable, they foreshadow serious problems to come. But first, we need to understand the history of these technologies and how they are rooted in performances of professionalism, play, politics, and panic. In this post, I provide an overview of the history of deepfakes and consider the societal implications of audiovisual manipulation. I argue that both the technology and its proposed solutions can be wielded as political tools.
Film, Context, and Audiovisual Manipulation
For many years, audiovisual manipulation was primarily the domain of the film industry. From Georges Méliès’ A Trip to the Moon (1902) to the computer-generated graphics of Rogue One: A Star Wars Story (2016) that brought a young Princess Leia back one last time, the production of fantasy, parody, and speculation was limited to entertainment industry professionals. For fictional films, the more realistic the manipulation, the better. But now, the tools of entertainment have become widely available to the public, positioned as creative technologies.
Henry Jenkins believes in the power of creativity for political critique. In a 2004 MIT Technology Review article, he writes about “Photoshop for democracy,” in which “participatory culture becomes participatory government.” He cites a set of manipulated videos and images that criticize political figures, including two images of the Three Stooges whose faces have been replaced with those of politicians. Jenkins argues that, within this context, the public is using Photoshop as a tool to produce meaningful criticism and hold elected officials accountable. Photoshop can enable those who feel excluded from political discourse to express political ideas.
But what happens when people believe creative audiovisual manipulations are real evidence and act upon them? For example, the Lumière brothers’ The Arrival of a Train at La Ciotat (1896) was one of the first motion pictures ever publicly screened. Newspapers described how people clamored to get out of the way as the steam train chugged toward them. Nearly half a century later, Orson Welles’ 1938 radio adaptation of The War of the Worlds took the form of a fictional news report and reportedly caused thousands of Americans to flee their homes because they believed a Martian invasion was afoot. The Soviet Union, too, notoriously engaged in politically motivated photo manipulation to alter the historical record.
In these cases, video and audio perform the role of evidence and simulate an unsubstantiated reality. The audience of The War of the Worlds was at home, listening to a simulated newscast on the radio for the first time, with no context against which to judge it. When that context is collapsed, panic follows. I call this performance simulacra at scale.
Social Media, Verification, and Audiovisual Manipulation
With social media, audiovisual artifacts are disseminated at higher speed and larger scale than ever before. This presents a new and exciting challenge for media makers, but it also raises daunting questions about verifying media artifacts, especially when the crowd is called upon to witness an atrocity.
With the advent of streaming and the instant sharing of high-definition video, the mobile phone camera became an important tool for the distributed witnessing of everyday events. In the last few years, there have been several cases in which the broader public, newly equipped to witness events via smartphones, has attempted to leverage the evidentiary qualities of video to hold the powerful accountable in the ways Jenkins describes. But these videos are often only considered evidence within hierarchies of established power.
While creative technologies can be tools for participatory democracy, they can also be used to break it down. Historically, we’ve seen audiovisual artifacts used to create a simulated reality where context is collapsed and panic ensues. In my next post, I will show how social media, coupled with interpretations of evidence, often reinforce preexisting power structures at the expense of the underrepresented. We need to consider solutions to the threat of deepfakes in the context of power.
Stay tuned this spring for a Data & Society report that investigates the history of audiovisual fakes developed and wielded for different purposes and outcomes. It will present ways to think about how contemporary fakes function within the speed and widespread reach of the contemporary networked information landscape.