The Deepfake Tango
Did you hear about Drake’s new album? Neither did Drake.
Artificial intelligence has made its way into the music industry, revolutionising the creative landscape. Models are now advanced enough to generate realistic vocal imitations of artists, enabling the production of songs that appear to be performed by famous musicians. Holly Herndon, like many others, has embraced AI in her music, exploring uncharted territory. Deepfakes, when used ethically, have the potential to transform the entertainment industry, offering realistic visual effects and enabling creative storytelling without the constraints of casting or location.
When Reality Becomes Uncertain
While offering exciting possibilities for collaboration in the entertainment industry, deepfakes render authenticity an elusive concept, potentially leading to large-scale misinformation. For instance, in 2020, two political ads featuring deepfake versions of Russian President Vladimir Putin and North Korean leader Kim Jong-un were created as part of a campaign by the nonpartisan advocacy group RepresentUs. The fake dictators conveyed that America did not need any foreign interference because it was capable of ruining its democracy on its own. Despite its alarming appearance, the campaign aimed to emphasize the importance of voting rights during the U.S. presidential election and to awaken Americans to the fragility of their democracy. The ads were initially intended to air on major networks but were unexpectedly pulled; given the sensitive nature of using deepfakes in a political context, it's possible the networks felt the public wasn't ready.
However, it's not just the world of politics that's under threat. Deepfakes have been weaponized in a more sinister and personal way. The adult entertainment industry has seen an influx of non-consensual deepfake pornography, where the faces of celebrities, ex-partners, and even ordinary individuals are superimposed onto actors in explicit videos. This misuse of technology is not only a gross violation of privacy but also a source of lifelong trauma for its victims. Actress Scarlett Johansson, a frequent target of such manipulated videos, remarked in an interview that "trying to protect yourself from the internet and its depravity is basically a lost cause." This growing issue forces us to confront not just the capabilities of this technology, but also the moral considerations that accompany it.
In another alarming high-profile case, a UK energy firm fell victim to a deepfake scam, losing $243,000 in the process. The scammer used deepfake audio to imitate the voice of the CEO of the energy firm's parent company. The energy firm's CEO was so convinced by the authenticity of the call that he recalled recognizing the distinct German accent and the unique "melody" in the voice, and he transferred the funds within an hour. How the scammer trained the AI to such precision remains unclear, but the fact that this happened back in 2019 is truly scary: in this age, four years of AI progress is worth decades of ordinary technological advancement.
Much like novelists and their wild imaginations, AI algorithms can now spin their own tales, only theirs feel so real they can put even the best fiction writers to shame. Just when you thought convincing your grandma that the 'alien invasion' video she shared on Facebook was fake was a huge task, in comes the era of deepfakes. These aren't your everyday, run-of-the-mill hoaxes. They could have you convinced that the government announced its plan to colonize Mars next Tuesday, or that Playboi Carti makes good music.
The Deepfake-Busters
But before you decide to go on a digital detox, there’s a glimmer of hope on the horizon —
The Coalition for Content Provenance and Authenticity (C2PA)
It's like the reliable neighborhood watch for the online world. Formed by tech heavyweights such as Adobe, Microsoft, and Intel, the C2PA aims to create a metadata framework. Metadata is like a 'behind-the-scenes' file that accompanies a piece of content, recording vital information about its origin and any alterations. Software and online platforms can then examine this metadata to validate the content's authenticity. With this kind of origin tracking, we could tell whether that image of an astronaut cat really came from NASA (spoiler: it probably didn't).
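To make the idea concrete, here's a minimal Python sketch of how hash-based provenance checking could work. The manifest format below is invented for illustration; real C2PA manifests are cryptographically signed, standardised structures, not ad-hoc dictionaries like this.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw content as a hex string."""
    return hashlib.sha256(data).hexdigest()

def make_manifest(content: bytes, creator: str, edits: list[str]) -> dict:
    # Toy manifest only: it records who made the content, what was
    # done to it, and a fingerprint of the bytes at signing time.
    return {
        "creator": creator,
        "edit_history": edits,
        "content_hash": sha256_hex(content),
    }

def verify(content: bytes, manifest: dict) -> bool:
    # Under this toy scheme, content counts as authentic iff its hash
    # still matches the fingerprint recorded in the manifest.
    return sha256_hex(content) == manifest["content_hash"]

photo = b"...raw image bytes..."
manifest = make_manifest(photo, creator="NASA", edits=["crop", "exposure +0.3"])

print(verify(photo, manifest))              # True: untouched since signing
print(verify(photo + b"tamper", manifest))  # False: altered afterwards
```

A real implementation would also sign the manifest itself, so a forger can't simply rewrite the recorded hash along with the content.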
Now, what does it mean for our beloved memers? It's certainly not a death sentence, but it does mean they need to be more careful about their content's journey. If a piece of content gets flagged as potentially misleading or manipulated, C2PA's standards can help sniff out the truth by following the breadcrumbs left by its digital ID, which could spell trouble for the content's author.
While the C2PA focuses on setting global standards for content verification, startups like Sentinel are taking a more hands-on approach.
Sentinel
This Estonian startup employs a multi-level approach to deepfake detection. First, it uses a hashing technique, comparing cryptographic hashes of incoming content against a database of known deepfakes to quickly identify matches. Simultaneously, it employs machine learning models to scan the metadata embedded within media files, searching for inconsistencies or patterns that might indicate tampering.
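Sentinel's exact pipeline isn't public, but that first layer, matching hashes of incoming files against a database of known deepfakes, is easy to sketch in Python (the hash entries below are placeholders):

```python
import hashlib
from pathlib import Path

# Placeholder database; a real system would hold hashes of
# millions of previously confirmed deepfakes.
KNOWN_DEEPFAKE_HASHES = {
    "9f2b6c0d...",  # truncated example entries
    "a13e77f2...",
}

def file_hash(path: Path) -> str:
    """Hash the file in chunks so large videos don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def is_known_deepfake(path: Path) -> bool:
    return file_hash(path) in KNOWN_DEEPFAKE_HASHES
```

Note that exact hashing only catches byte-identical copies; a re-encoded or cropped deepfake would slip through, which is presumably why Sentinel layers other detectors on top.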
On the audio front, advanced signal processing techniques identify synthetic voice signatures or anomalies arising from voice manipulation algorithms. For video, especially facial regions, frame-by-frame analysis is executed, using deep learning to detect subtle visual inconsistencies that human eyes might miss. By combining data from these diverse detection layers, Sentinel calculates a composite score, representing the likelihood of the media being a deepfake.
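How Sentinel actually weighs its layers isn't disclosed, but a composite score could be as simple as a weighted average of per-layer confidences; the layer names, scores, and weights below are purely illustrative:

```python
# Hypothetical per-layer confidences in [0, 1], where 1 = "looks fake".
layer_scores = {
    "hash_match":     1.0,  # exact match in the known-deepfake database
    "metadata_scan":  0.4,  # mild inconsistencies in embedded metadata
    "audio_analysis": 0.7,  # synthetic-voice artefacts detected
    "face_analysis":  0.8,  # frame-level visual inconsistencies
}

# Made-up weights reflecting how much each signal is trusted.
weights = {"hash_match": 0.4, "metadata_scan": 0.1,
           "audio_analysis": 0.2, "face_analysis": 0.3}

composite = sum(weights[k] * layer_scores[k] for k in weights)
print(f"Deepfake likelihood: {composite:.2f}")  # 0.82
```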
While Sentinel is drawing attention for its prowess in AI-generated content detection, there's another emerging champ in this space:
GPTZero
Instead of getting lost in a sea of words, GPTZero simplifies things by converting those words into numbers. Edward Tian, the founder of GPTZero, is a braver man than me for preferring numbers over words.
Diving into its mechanics, GPTZero hinges on two aspects: perplexity and burstiness.
Now, perplexity is connected to 'entropy'. In the realm of machine learning, entropy isn't about heat or energy, but about gauging unpredictability. Imagine reading a book where you can't guess the next word; that's high entropy. Conversely, a predictable book, where you often know what comes next, has low entropy. GPTZero gauges this unpredictability in sentences. For instance, if a sentence finishes the statement "The sky is…" with "blue", there's hardly any surprise: low entropy. But if it bizarrely said "tasting like strawberries"? That's an unexpected twist, and the entropy shoots up, indicating a possible human touch.
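Under the hood, perplexity is typically computed from the probabilities a language model assigns to each successive token. Here's a minimal Python sketch, with made-up probabilities standing in for a real model's output:

```python
import math

def perplexity(token_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-probability.
    Low values mean the text was predictable to the model."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# "The sky is blue": every word is roughly what a model expects.
predictable = [0.9, 0.85, 0.95, 0.9]

# "The sky is tasting like strawberries": the twist gets tiny probability.
surprising = [0.9, 0.85, 0.95, 0.001]

print(perplexity(predictable))  # ~1.08 -> low, reads as machine-like
print(perplexity(surprising))   # ~6.1  -> high, suggests a human touch
```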
The other aspect, burstiness, measures how often these unpredictable moments appear throughout a text. Humans, with our dynamic minds, tend to have fluctuating patterns. AI, however, often remains on a steadier path. This contrast helps GPTZero discern if it’s creativity or machinery at work.
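Burstiness can then be approximated as the spread of perplexity across sentences. Again, the per-sentence numbers here are invented for illustration:

```python
import statistics

# Hypothetical per-sentence perplexities for two texts.
human_text = [12.0, 45.0, 8.0, 60.0, 15.0]   # wild swings
ai_text    = [14.0, 16.0, 13.0, 15.0, 14.5]  # eerily steady

def burstiness(per_sentence_ppl: list[float]) -> float:
    """Standard deviation of sentence-level perplexity:
    higher = more fluctuation = more human-like."""
    return statistics.stdev(per_sentence_ppl)

print(burstiness(human_text))  # ~23.1
print(burstiness(ai_text))     # ~1.1
```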
The Epilogue
“Technology met me at a very strange time in my life”
So, while AI might be getting crafty with its fakery, thanks to C2PA, Sentinel and GPTZero, we’re learning to be smarter viewers. After all, why just be a passive audience when you can be a discerning critic, right?
But let’s not forget, as advanced as these tools become, they are continually playing catch-up. AI models evolve, becoming more sophisticated in mimicking human-like patterns. It’s an ongoing battle, with both sides upping their game. Each improvement in AI detection is met with an equally innovative advancement in AI content generation.
It really makes you wonder to what extent these fake videos, audio clips, and really any form of "art" will have the power to distort our understanding of what's real. It all boils down to one question: how much of our reality will be left untainted by this technology?
All I know is, there could not be a more exciting time to be alive.