Speak a Foreign Language in Your Own Voice? Microsoft’s VALL-E X Enables Zero-Shot Cross-Lingual Speech Synthesis

Synced
SyncedReview
Published in
4 min readMar 13

robotic. The leveraging of deep neural networks in recent years has dramatically transformed TTS, enabling conditioning on factors such as stress and intonation to achieve higher quality and much more humanlike results. Contemporary TTS models however still perform best when dealing with a specific…

Synced
SyncedReview

AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global