Audio Deepfakes: What are the pros and cons?

Savneet Singh
4 min read · Feb 19, 2024


Audio Deepfakes (ADs) were first introduced for benign purposes, such as producing audiobooks to improve people’s lives, yet they have since been exploited in ways that compromise public safety. Audio deepfakes have been used to spread misinformation all over the world, and this malicious use has made people fearful of the technology.

ADs can be used for political propaganda to influence public opinion, or even for terrorism. Enormous volumes of voice recordings are shared online every day, making it difficult to distinguish fake content from genuine recordings. With technological advances, anyone can produce audio deepfakes or audio manipulations on a mobile device or home computer. This means deepfakes can influence election results by targeting organizations, political leaders, and governments.

Audio Deepfakes: Harmless intentions

There are a variety of legitimate uses for ADs and voice generators that can significantly reduce manual effort and save many hours of work. These tools can be used in a wide range of applications, such as:

  • creating podcasts
  • generating transcripts
  • producing audiobooks
  • assisting businesses and clients in scheduling appointments, receiving reminder calls, and obtaining business information such as hours of operation, etc.

Google can actually make calls on your behalf to book appointments; yes, you read that right! Google Duplex, launched in 2018, can help you schedule certain types of appointments. The AI assistant follows a specific script while actually listening and responding accordingly. The announcement article includes several voice samples[1] (you should check them out!), and they sound pretty realistic, too, using informal fillers just like human speech, such as “um,” “uhh,” and “mhmm.” Businesses, on the other hand, can use Duplex to assist customers, helping them with business hours, appointment scheduling, reminder calls, and so on.

The ethical problem here is that the person on the other end of the phone should be aware that they are not talking to a real person. Additionally, these calls could be used to spam and scam people if there aren’t any safeguards around them.

Malicious actors also exploit this technology, going to great lengths to scam people with their sinister intentions.

One shocking instance of deepfake audio fraud cost a U.K.-based energy company US$243,000[2]. The CEO received a call mimicking the voice of the chief executive of the company’s Germany-based parent firm, asking him to facilitate an urgent fund transfer. In reality, it wasn’t the parent firm at all; it was a scammer. The interesting part is that the caller had a German accent, something that was difficult to reproduce with deepfake audio a few years ago, but the fakes keep getting better over time. This particular incident is still being investigated to confirm whether a deepfake call was actually used.


The 2021 documentary “Roadrunner: A Film About Anthony Bourdain” featured an audio deepfake of Anthony Bourdain’s voice, synthesized from his original audio clips. This brings up an ethical issue: should you be able to synthesize and use someone’s voice without their consent, especially if they are no longer in this world?[3]

Another famous and widely discussed audio deepfake is the synthetic-voice song featuring Drake’s and The Weeknd’s voices, which was shared on TikTok by ‘Ghostwriter977’.

This instance of deepfake raises another question of ethics and fairness: if no one holds copyright over the ‘sound’ of a voice, how is this illegal or an instance of copyright infringement?

The song was not stolen or copied from Drake’s, The Weeknd’s, or anyone else’s existing songs, and the vocal quality of these artists is not something Universal Music Group owns. But it raises another question: the machine learning model was trained on large amounts of material related to these two artists in order to create the finished product.

One of the arguments in its favor is that the machine learning model was trained on material from these two artists that is already widely available to the public. Is this argument really unjustified? Different people in our lives inspire all of us, and art is about drawing inspiration and influence from what we see, hear, and feel around us. So, in this particular case, how is being inspired or influenced by an identifiable artist wrong?

With artificial intelligence, the only difference is that the final product or the simulation is close to perfect — a finished product that may not have been imagined before.

Another application of ADs is voice-cloning scams. I have shared some facts about voice-cloning scams in my article here, which covers how to recognize voice scams and protect yourself.

Let me know your thoughts and experiences related to Audio Deepfakes here in the comments.


Resources

  1. Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone
  2. Unusual CEO Fraud via Deepfake Audio Steals US$243,000 From UK Company
  3. A Haunting New Documentary About Anthony Bourdain
