Voice Acting and Parametric TTS

Parametric TTS and AI will replace or supplement all voice acting within 20 years. I’m a software developer trying to ensure humans benefit from machine learning. The future is good: Actors will have Character assets and Creators will have affordable Scripts.

To understand why something is the way it is: that’s everything. “Why” is the most important question we can answer. No one person has the correct answer, either. Only from a myriad of perspectives can we hope to dispel our own ignorance on a subject. This post is my perspective on the impact parametric TTS and AI will have on the voice acting industry.

Heard about this AI revolution? Probably. Know how this stuff works and the impact it’ll have on your industry? Probably not. Take a second and play with an actual neural network in the Google TensorFlow playground at http://playground.tensorflow.org. I really enjoy this kind of stuff, but I’m a software developer and I don’t expect the typical voice actor to care about the technical details. What I do expect is for every voice actor to start thinking about how this technology could affect them.

For voice actors, parametric TTS is the most relevant advancement in machine learning. Why should actors care? Computer-generated speech is famous for sounding more like a computer than a human. But do you know why? It’s because these systems have traditionally used the concatenative model. The concatenative model uses a large database of prerecorded speech fragments from a single actor and strings them together to form speech. That’s not how parametric TTS works. Parametric TTS uses neural networks to generate the audio itself, and listeners have recently rated the results as much closer to human speech than anything earlier systems produced. Read the article DeepMind, a Google company, released in September 2016 about its WaveNet model. What was once limited to academia is now becoming commercialized. So what does this mean for the average voice actor? Well, it depends on how you respond to this new reality.
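For the developers reading along, here is a toy Python sketch of that difference. It is only an illustration under loud assumptions: sine tones stand in for recorded speech fragments, every name in it (`FRAGMENTS`, `concatenative`, `parametric`) is made up for this post, and the "model" is a two-term recurrence rather than a neural network. What it shares with WaveNet is the general shape of the idea, generating each audio sample from the samples that came before it.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary value for this toy


def tone(freq_hz, n_samples=400):
    """Stand-in for a recorded speech fragment: a short sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


# Concatenative: a fixed database of fragments "recorded" in advance.
FRAGMENTS = {"he": tone(220), "llo": tone(330)}


def concatenative(units):
    """Look up each unit and string the stored fragments together."""
    audio = []
    for unit in units:
        audio.extend(FRAGMENTS[unit])  # KeyError if it was never recorded
    return audio


# Parametric: no stored audio. Each new sample is computed from the
# previous two samples and a single parameter (a toy "model").
def parametric(freq_hz, n_samples=400):
    """Generate audio sample by sample from the samples before it."""
    w = 2 * math.pi * freq_hz / SAMPLE_RATE
    coeff = 2 * math.cos(w)  # the only parameter of this toy model
    audio = [0.0, math.sin(w)]
    while len(audio) < n_samples:
        audio.append(coeff * audio[-1] - audio[-2])
    return audio[:n_samples]
```

Calling `concatenative(["he", "llo"])` can only replay what is already in the database, so a unit that was never recorded raises a `KeyError`. Calling `parametric(275)` produces a pitch nobody recorded, because the audio comes from parameters instead of a library of clips.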

You can ignore, fight, or adopt parametric TTS. For the lucky few who already have multi-million dollar contracts with AAA studios or the Fortune 500, ignorance may be your best choice. Why worry about this when your livelihood is not going to be affected? Such is the luxury of having more money than you know what to do with. Feigning ignorance also keeps you from having to listen to any struggling actor who’s interested in your thoughts on the subject. Your thoughts are simply, “I don’t care” or “It’s a fad”. I joke, but some actors will choose ignorance. Just don’t be ignorant for the wrong reasons.

Maybe things are just good enough for you and resistance seems like the best solution. You could resist: find and break all the machines. Organize like-minded Luddites and have yourself a modern-day power-loom riot. Historically, that has only delayed the inevitable. You could try to push legislation that prevents its usage, but in a pro-job House and Senate that will likely never pass in America, or in any country trying to compete globally. OK, so what does it look like to adopt parametric TTS as a voice actor?

Well, we’re not exactly sure either, but we are designing our character-first voice acting platform for it! We see this trend coming and want it to benefit as many people as possible. The early implementation will probably look like those auto-generated replies you see in Gmail. It’ll automatically create a bunch of Performances that you can approve or replace with your own traditional recordings. This assisted audio generation will enable new levels of efficiency and make larger Scripts more accessible to small creatives. Adoption, therefore, could be as simple as joining a voice acting platform that implements parametric TTS for you.

Accreu, at accreu.com, provides character-first voice acting for small independent creatives. Our Actors and Creators, members of the next economy, are building the stuff dreams are made of. As technology continues to push society forward, everyone will look to these pioneers to lead the way.