Which Text to Speech technology generates the most human-like speech?

Eloy Ferrer
Published in Voxabot
1 min read · Jun 21, 2022
Photo by Lyman Hansel Gerona on Unsplash

When people who have not followed the evolution of text-to-speech (TTS) technology think about synthetic voices, they usually imagine a clunky, robotic sound. But TTS technology has made real progress in the last few years, to the point that current synthetic voices can be almost indistinguishable from real human voices.

But in this race to produce human-like synthetic voices, some companies and technologies are producing outstanding voices while others are lagging behind.

Below you can listen to some samples created using popular text-to-speech providers. I have ordered the voices from most human-like to least human-like, but since this order is somewhat subjective, you should listen and judge for yourself.

#1: Audio created with Microsoft Azure Speech. Voice: Christopher

#2: Audio created with Amazon Polly. Voice: Matthew

#3: Audio created with Google Cloud Text-to-Speech. Voice: WavenetC

#4: Current synthetic voice created with NaturalReader

#5: Current synthetic voice created with ReadSpeaker

#6: Audio created with a Microsoft Windows desktop voice. Voice: David

Voxabot cofounder — Engineering a better experience with text and speech