AI Voice Cloning: A Double-Edged Sword

Artificial intelligence (AI) has revolutionized many aspects of our lives, offering new possibilities and conveniences. However, it also brings forth significant challenges, particularly in the realm of voice cloning. This technology can replicate anyone’s voice with stunning accuracy, posing both opportunities and threats.

What Is AI Voice Cloning and Why Is It Important?

AI voice cloning is all the rage now. This article covers its importance, benefits, and fields of use.

What Is Voice Cloning?

Voice cloning is the process of replicating a person’s voice in digital form by training neural networks on an audio sample. The cloned voice can later be used to voice over any amount of video or audio content. The technology is not brand new: it was pioneered in 1998. Both then and for a long time afterward, it required technical skills, powerful hardware, and long voice samples to do its job.

What Are the Benefits of Voice Cloning Technology?

While there are certain ethical concerns related to voice cloning technology, it also offers many benefits. Here are the principal ones:

Voice Cloning in Publishing

Audiobooks are steadily gaining in popularity. This is because they enable people to keep up with reading while doing other things. Audiobooks read by authors are especially popular with readers. Voice cloning enables publishers to free the author from numerous recording sessions by simply recording a sample of their voice.

Optimize Talent Availability

You might want to voice your corporate videos in the CEO’s voice, but they definitely don’t have the free hours to spend in the recording studio. Cloning their voice is an excellent way to ensure that you have it available when you need it. This optimization benefit works for any kind of voice talent that you work with, ensuring that you have the required voice even when the talent behind it is busy.

Revive Historical Voices

Imagine watching a documentary about Britain’s part in WWII narrated by Winston Churchill or a biopic of Alfred Hitchcock told in his own voice. With AI-powered voice cloning, reviving historical voices from existing recordings has become a reality.

Broaden Content Diversity

Voice cloning helps to broaden your content diversity. It enables you to create live weather or sports updates in the presenter’s voice without actually involving them.

Localize in New Languages

Do you need to spread the corporate message to your offices in other countries? Want to expand your audience as a video or podcast creator? Need to make your media content available to viewers who don’t speak the official language? Voice cloning is there to help you. With new advances, you can use a cloned voice to dub the same content in different languages, without the thick accent!

The Threat to Voice Actors

Voice actors are particularly vulnerable to the rise of AI voice cloning. The Australian Association of Voice Actors (AAVA) estimates that the jobs of around 5,000 local voice actors are at risk. AI-generated voices are already replacing humans in commercials, audiobooks, and video games. While high-profile actors like Scarlett Johansson can take legal action against the unauthorized use of their voices, everyday actors face the real threat of job loss.

Simon Kennedy, president of AAVA, emphasizes the need for fair rules and protection against voice misuse by AI. The advent of AI has prompted the formation of groups like AAVA to advocate for the rights of voice artists. Kennedy warns that the cost savings companies expect from AI may backfire, as the lack of human connection in AI-generated audiobooks, for example, could lead to a loss of consumer interest.

High-Profile Examples: Morgan Freeman

High-profile individuals like Morgan Freeman are also at risk. Freeman’s distinctive voice is easily recognizable and widely beloved, making it a prime target for misuse. Scammers could use AI to clone his voice and create fake endorsements, misleading the public and potentially harming his reputation.

The Rise of Voice Cloning Scams

Voice cloning technology isn’t just a threat to voice actors and celebrities; it’s also becoming a tool for scammers. Recent advancements in AI have enabled a surge in AI-generated media, presenting both opportunities and risks. Among these risks are AI voice cloning scams that exploit personal connections. Scammers can now replicate voices with startling accuracy using just a few seconds of audio, posing as loved ones in distressing scenarios like emergencies or financial troubles. This technology has been used to impersonate family members, with victims receiving convincing messages urging immediate action. To protect against such scams, it’s crucial to verify requests directly through trusted channels and remain vigilant against unexpected urgent appeals that could be fraudulent. As AI continues to evolve, awareness and caution are essential defenses against these increasingly sophisticated forms of deception.

According to the Federal Trade Commission, business imposters using cloned voices have been on the rise, resulting in over $752 million in losses last year. The process of cloning a voice is alarmingly simple: a 30-second audio clip and a small fee are all it takes to generate a convincing AI voice.

One of the most alarming cases involves Gary Schildhorn, an attorney experienced in fraud prevention, who nearly fell victim to a voice cloning scam. He received a call from someone who perfectly mimicked his son’s voice, claiming to be in legal trouble. Schildhorn was on the verge of withdrawing $9,000 to help his son before realizing it was a scam.

How Voice Cloning Works: An Example

To understand how voice cloning works, let’s break down the process with an example:

Step 1: Source Audio Clip

A scammer needs a 30-second audio clip of the target’s voice. This can be sourced from social media posts, voicemail greetings, or recorded phone calls.

- Example: The scammer finds a video on social media where John, the target, is speaking.

Step 2: AI-Powered Website

The scammer visits an AI-powered voice cloning website. These platforms are easily accessible and often require a small fee.

- Example: The scammer pays $5 to access the voice cloning service.

Step 3: Upload and Clone

The scammer uploads John’s 30-second audio clip to the website. The AI processes the clip to create a digital model of John’s voice.

- Example: The website takes about 2–4 minutes to generate a clone of John’s voice.

Step 4: Generate Fake Messages

The scammer types the desired message into the website’s interface. The AI uses the cloned voice to produce audio of John saying whatever the scammer types.

- Example: The scammer types, “Mom, I’m in trouble. I need you to wire me $5,000 immediately.”

Step 5: Execute the Scam

The scammer uses the generated audio to make a phone call or send a voice message, tricking the target into believing it’s John.

- Example: John’s mother receives a call and hears “John’s” voice urgently asking for money. Believing her son is in danger, she wires the money without question.

Targeting the Elderly: A Vulnerable Group

Elderly individuals are particularly vulnerable to voice cloning scams due to their trusting nature and potential lack of familiarity with advanced technology. For instance:

Scenario: Elderly Woman Targeted

Mary, an elderly woman living alone, receives a call from someone who sounds exactly like her daughter, Jane.

- Step 1: The scammer obtains a 30-second audio clip of Jane’s voice from a public social media post.

- Step 2: They use an AI-powered website to clone Jane’s voice for a small fee.

- Step 3: The scammer generates a message in Jane’s voice, saying, “Mom, I’m in trouble. I need your bank account information to transfer some emergency funds.”

- Step 4: Mary, recognizing the voice as her daughter’s, panics and provides her bank account details.

This example illustrates the ease with which scammers can exploit vulnerable individuals, leading to potentially devastating consequences.

Regulatory Measures

Congresswoman Yvette D. Clarke, a member of the U.S. House of Representatives AI task force, has introduced legislation to address the challenges posed by AI voice cloning. The proposed laws would require companies to watermark AI-generated content, ensuring transparency and accountability. Clarke emphasizes the urgency of these measures, as the average person may struggle to discern between real and AI-generated voices.

Industry Support for Regulation

The startup used by CBS New York’s Mahsa Saeidi to generate synthetic audio supports the introduction of regulations like digital watermarking. They believe it will help establish the origins of digital content and provide a layer of protection against misuse. The company is actively working on creating technology to identify synthetic audio, aiming to set industry standards.

Factors to Consider When Choosing a Voice Cloning Application

As advances in AI technology have made voice cloning a standard procedure, many apps and software solutions on the market now offer such services. Here is a list of things to consider before making your choice.

Output Quality

The desired quality of output depends on your needs, of course, but the idea is that the cloned voice has a natural sound. While every provider on the market promises human-like digital voices that get the nuances of your voice right, checking for quality before you commit to a specific provider is a good idea. If the company doesn’t have any samples on its website, request a demo.

Intuitive Interface

Is it easy to use the app? How intuitive is the user interface? Will you be able to figure it out on your own? To determine this, watch the app’s product videos and demos.

Voice Protections

When you clone your voice, the most important consideration is the protection of your data. You want to be sure that the copy of your voice is securely stored and that the company that cloned it does not use it for any other purposes. Make sure that the company/application you choose subscribes to the ethical use of voices and guarantees the protection of your data.

The Future of Voice Cloning Technology

The future of voice cloning technology is unfolding as we speak. Powered by deep learning neural networks, cloned voices can already voice texts in languages other than the original. The number of languages such voices can cover will grow, as will the accuracy. Soon, we can expect cloned voices to be able to voice texts in other languages without accents.

In addition to dubbing movies, voicing over corporate and entertainment content, and being used in educational settings, cloned voices can power personal virtual assistants and staff customer service call centers. Applications also exist in gaming and interactive storytelling.

Vidby Expertise

Vidby’s founders and experts have over 10 years of experience in working with speech technology and AI. Vidby is YouTube’s Recommended Vendor for AI-powered content localization. It has rolled out a library of more than 2,500 AI-generated voices in 75 languages. Its R&D team is actively working on speech synthesis-related issues to make the synthesized speech as nuanced and emotional as human speech.

Vidby’s voice-cloning tool is cross-lingual, meaning that after cloning your voice, you can use the company’s services to voice over your videos in eight languages (with 20+ more coming very soon). The company subscribes to all the rules of ethical conduct regarding AI and voice technology, ensuring that your cloned voice is safe with them.

Conclusion

For 25 years, voice cloning was a gated technology that required skills and hardware beyond the reach of regular users. Recent AI advances have broken those barriers, making AI-powered voice cloning available to virtually anyone. What seemed like science fiction a couple of decades ago has now become a reality for millions of people. Voice cloning offers new opportunities in many content-heavy industries, from education to gaming. However, AI voice cloning is a double-edged sword because its proliferation raises concerns of ethical use, data privacy, and authenticity. Navigating this new terrain to avoid pitfalls and reap benefits will be an ongoing challenge, but the potential rewards include a complete overhaul of human-to-human and human-to-computer communication.

While regulators are working to catch up with the rapid evolution of voice cloning technology, individuals must take proactive steps to protect themselves:

- Create a Code Word: Establish a secret code word with family members that only they would know. In case of an emergency call, ask for the code word to verify the caller’s identity.

- Be Skeptical: If someone calls asking for money or sensitive information, pause and double-check their story. Contact the person directly through a known number or another trusted method.

- Limit Public Exposure: Be mindful of the audio and video content you share publicly. Restrict access to your social media accounts and avoid posting extensive audio clips.

- Use Secure Communication: Encourage family and friends to use secure communication methods that are less susceptible to interception and recording.
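The first two steps in the list above amount to a simple decision rule, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the IncomingCall record, its field names, and the is_request_verified function are hypothetical and not part of any real product or library. The one library call used, hmac.compare_digest, is a standard Python function for comparing secrets.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingCall:
    """Hypothetical record of what a caller asks for and how they were checked."""
    asks_for_money_or_data: bool      # caller requests money or sensitive information
    spoken_code_word: Optional[str]   # code word the caller gave, if any
    confirmed_by_callback: bool       # you reached the person on a number you already knew


def normalize(word: str) -> bytes:
    # Ignore case and surrounding whitespace so " Bluebird " matches "bluebird".
    return word.strip().lower().encode("utf-8")


def is_request_verified(call: IncomingCall, family_code_word: str) -> bool:
    """Return True only if the request passes at least one identity check."""
    if not call.asks_for_money_or_data:
        return True   # nothing sensitive was requested, so no check is needed
    if call.confirmed_by_callback:
        return True   # verified through a trusted channel
    if call.spoken_code_word is None:
        return False  # urgent request with no proof of identity
    # Compare the spoken word with the agreed one without leaking timing information.
    return hmac.compare_digest(
        normalize(call.spoken_code_word), normalize(family_code_word)
    )


# Example: an urgent request for money with no code word and no callback.
suspicious = IncomingCall(asks_for_money_or_data=True,
                          spoken_code_word=None,
                          confirmed_by_callback=False)
print(is_request_verified(suspicious, "bluebird"))  # False
```

Note that how the voice sounds is deliberately not an input to the check: a cloned voice can sound exactly right, so only the code word or a callback to a known number counts as verification.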
