The demand for Text to Speech (TTS) is increasing, and its use cases are extending beyond screen readers and assisted reading. In the last few years, machine learning has produced synthetic voices that are remarkably close to human speech. This improves the experience for listeners. It doesn’t yet entirely hide the uncanny robotic quality, but in the very near future you will hardly be able to tell the difference between a machine and a human.
TTS voices can also produce consistent pronunciations, which can help language learners. For now, TTS with machine learning is expensive to run in the long term for screen readers. However, as the cost of computing power decreases, smartphones and laptops will soon have more natural screen readers built into their software. Dedicated machine learning chips will become the norm for generating natural voices, which will give more people access to high-quality natural voices. The current screen readers in macOS and Windows still sound robotic and monotonous. This may irritate some people and ruin their internet browsing experience. Conversely, natural voices will provide a vastly improved user experience.
Smart home devices are talking too. Home appliances can update users on status changes. A robot vacuum cleaner, for example, can tell you when it is time to charge or when it gets stuck somewhere in the room. It could say “I’m stuck, please help me” instead of making beeping noises.
In Iron Man, JARVIS is Tony Stark’s artificial intelligence assistant. Could we one day have our own JARVIS? Of course. The voices could also be personalised: a bespoke voice to assist with your daily needs or even have conversations with you. This is uniquely beneficial for people who have lost their voice because of diseases such as amyotrophic lateral sclerosis (ALS). It’s a motor neurone disease that causes the death of the neurons controlling voluntary muscles, such as those in the tongue, arms and legs. It’s the same disease that struck Dr Stephen Hawking. He relied on a speech-generating device built by Intel. Read more about it here.
ALS has taken away many people’s ability to speak. The ALS Association has partnered with a startup, Lyrebird, on a non-profit initiative called Project Revoice. Their mission is to ensure that people with ALS don’t have to suffer after being robbed of their voices. They will work with ALS patients to create a digital clone of their voices that aims to fully recreate the unique essence, nuance and accent of each individual.
Text to speech can help speakers and voice actors prepare their speeches. But it will not fully replace their art in seminars, sermons and spoken word performances. At the moment, AI voices cannot mimic the soul of a person, as they lack an identity.
Motivational speaking is an art form. The current challenge is for machines to become creative and develop their own style. There is a standardised editor that lets you add emphasis to certain words and change the pitch and speed of sentences. This still doesn’t add character to a machine’s voice, but it’s exciting to know that there is potential for this technology. The technology will need to be regulated as it gets perfected. Voice cloning does raise ethical concerns. Bad actors will abuse this technology, for voice phishing for example, but the benefits outweigh the drawbacks.
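For the curious, here is a rough idea of what adding emphasis and changing pitch and speed can look like under the hood. This is only a sketch, and it assumes the markup is SSML (Speech Synthesis Markup Language), the W3C standard that many TTS engines accept; the editor mentioned above may well work differently. The helper names (emphasise, prosody, speak) are made up for illustration, and some engines expect extra attributes on the speak element.

```python
from xml.sax.saxutils import escape


def emphasise(text, level="strong"):
    """Wrap plain text in an SSML <emphasis> tag (hypothetical helper).

    The text is escaped so characters like & or < cannot break the markup.
    """
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def prosody(inner, pitch="medium", rate="medium"):
    """Wrap already-built SSML in a <prosody> tag to set pitch and speed."""
    return f'<prosody pitch="{pitch}" rate="{rate}">{inner}</prosody>'


def speak(*parts):
    """Join SSML fragments into one <speak> document."""
    return "<speak>" + "".join(parts) + "</speak>"


# A slow, high-pitched sentence with one strongly emphasised word.
sentence = escape("You can do ") + emphasise("anything") + escape(".")
ssml = speak(prosody(sentence, pitch="high", rate="slow"))
print(ssml)
```

The resulting string is what would be sent to a TTS engine in place of plain text. How closely the voice follows the pitch, rate and emphasis hints depends on the engine.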
Text to Speech should be accessible to all. That’s why Verby was created. Its simple user interface lets visitors easily convert text to speech. Create an account and get 1000 characters for free.
Try Verby.co now