Voice Cloning AI in Mendix

The Voice cloning AI model enables the user to create an artificial simulation of a person’s voice.

Karthikeyan Gopalan

Published in Mendix Community

5 min read · Apr 13, 2023

Voice Cloning AI in Mendix Banner Image — a woman opposite their robot doppelganger

Yes, the future is here!!

In this age of technological evolution, the revolutionary technology of speech synthesis surely makes heads turn! The Eleven Labs AI Speech platform presents an incredible product that allows the user to train a voice model using just 60 seconds of a person’s voice. The trained voice can then synthesize speech from text prompts entered by the user.

To clone a voice, you will need a recording of the person’s voice that is 60 seconds to several minutes long. The recording is used to create a digital model of their speech patterns. The model is then fed into a text-to-speech engine that converts text into speech that sounds like the original speaker’s voice. The technology uses deep learning algorithms to analyze the audio data and extract the nuances of the speaker’s voice, including tone, pitch, and pronunciation.

Now, we shall see how this Voice Cloning functionality is developed in the Mendix Platform.

Implementation on Mendix

The level of accuracy to which the Voice Cloning technology can convert a person’s voice is truly remarkable. This awe-inspiring technology has been integrated with Mendix using the Eleven Labs APIs.

What do we need?

· Play Audio Marketplace Module by Clevr: https://marketplace.mendix.com/link/component/120804

· API Key: from the Eleven Labs Profile section

Getting Started

Go to Eleven Labs and create an account.

ElevenLabs || Prime Voice AI (beta.elevenlabs.io)

On Eleven Labs, go to the Profile section, where you can find the API key. The screenshot below shows how to get the API key that needs to be used in Mendix.

From the Voice Lab home page, we can train or synthesize the voice that we want.

In Studio Pro

I have cloned about 8 voices here. Each voice was trained on about 60 seconds of the corresponding audio clip.

From the Mendix perspective, we can call a REST API to GET all those voices. Add the data view to the page and use the microflow below as its data source; the microflow calls the Eleven Labs REST API.

Data Source Microflow which gets all the available voices

To GET all the voices that have been custom-cloned, configure the Call REST service activity as shown below:

Call GET HTTP Method with the mentioned URL
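
Outside Studio Pro, the same GET call can be sketched in a few lines of Python. This is only an illustration of the request that the Call REST service activity sends, not the Mendix implementation. The base URL, the /voices path, and the xi-api-key header name are assumptions taken from the public Eleven Labs API reference, so please verify them there. The function name is hypothetical.

```python
import urllib.request

# Assumed base URL of the Eleven Labs REST API (check the API reference).
API_BASE = "https://api.elevenlabs.io/v1"


def build_list_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that lists all voices.

    The API key from the Profile section is assumed to travel in the
    'xi-api-key' header.
    """
    return urllib.request.Request(
        f"{API_BASE}/voices",
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_list_voices_request("YOUR_API_KEY")  # placeholder key
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it. In Mendix, the same header is added as a custom HTTP header on the REST activity.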

As the next step, we must create the import mapping and JSON structure to handle the response.

JSON Structure
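
To give an idea of what the import mapping does, here is a small Python sketch that maps a voices response onto simple objects. The sample JSON is a hypothetical, trimmed-down payload. The field names voice_id, name, and category are assumptions based on the Eleven Labs API reference, and the values are placeholders, not real data.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Voice:
    """Rough counterpart of the Mendix entity filled by the import mapping."""
    voice_id: str
    name: str
    category: str


# Hypothetical sample response; field names are assumed, values are made up.
SAMPLE_RESPONSE = """
{
  "voices": [
    {"voice_id": "voice-id-1", "name": "Sample Voice", "category": "premade"},
    {"voice_id": "voice-id-2", "name": "My Cloned Voice", "category": "cloned"}
  ]
}
"""


def parse_voices(payload: str) -> List[Voice]:
    """Map the JSON payload onto Voice objects, ignoring extra fields."""
    data = json.loads(payload)
    return [
        Voice(
            voice_id=item["voice_id"],
            name=item["name"],
            category=item.get("category", ""),
        )
        for item in data.get("voices", [])
    ]


if __name__ == "__main__":
    for voice in parse_voices(SAMPLE_RESPONSE):
        print(voice.voice_id, voice.name, voice.category)
```

In Studio Pro, pasting a real response into the JSON structure document plays the role of SAMPLE_RESPONSE, and the import mapping plays the role of parse_voices.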

Once everything has executed, all the configured voices are listed on the Overview page of your Mendix app, as shown below:

List of custom-cloned and pre-cloned voices

Here, we can find some sample voices that are already present in Eleven Labs. As seen in the above screenshot, we have added some of the cloned voices which will be easy to identify. With the help of the Play Audio JavaScript Action, we have incorporated logic in the nanoflow to play the cloned voice returned from Eleven Labs after calling a POST REST Activity inside the sub nanoflow.

Inside the nanoflow, a microflow has been added to use the POST REST Service to generate and stream the cloned voice with the user's supplied text.

POST Method and the End Point URL to stream the audio

Store the response in a file document

Once it is all set, pass the object into the Play Audio activity, which is used in the nanoflow to stream the expected voice output.
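
For readers who want to try the POST call outside Mendix, here is a minimal Python sketch of the same two steps: build the request, then store the binary response in a file. The /text-to-speech/{voice_id}/stream path, the xi-api-key header, and the JSON body with a single text field are assumptions based on the Eleven Labs API reference. All function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

# Assumed base URL of the Eleven Labs REST API (check the API reference).
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_stream_request(api_key: str, voice_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that streams synthesized speech."""
    body = json.dumps({"text": text}).encode("utf-8")
    safe_voice_id = urllib.parse.quote(voice_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{safe_voice_id}/stream",
        data=body,
        headers={
            "xi-api-key": api_key,  # assumed header name for the API key
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def save_audio(audio_bytes: bytes, target: Path) -> Path:
    """Write the audio bytes to disk, like storing them in a file document."""
    target.write_bytes(audio_bytes)
    return target


def synthesize_to_file(api_key: str, voice_id: str, text: str, target: Path) -> Path:
    """Send the request and store the returned audio (needs network access)."""
    request = build_tts_stream_request(api_key, voice_id, text)
    with urllib.request.urlopen(request) as response:
        return save_audio(response.read(), target)
```

Calling synthesize_to_file with a real API key and voice ID should leave an audio file on disk. In the Mendix app, the file document takes the place of that file.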

Finally, the cloned voice will look like this:

The Voice which has been chosen is Elon Musk’s, and the speech synthesis is as follows:

For API Reference — https://api.elevenlabs.io/docs

Conclusion:

Listen to the concluding line in Matthew McConaughey’s voice!

Introduction - ElevenLabs: Learn how to generate speech from text (docs.elevenlabs.io)
