When Speech Synthesis is Human

Published in Social Robots · 1 min read · Dec 29, 2017

Things might start to get spooky when speech synthesis reaches human level and we’re unable to tell computer-generated speech from the real thing. I’m reminded of the Jolly Roger number you can merge into telemarketing calls to keep the caller on the line as long as possible while you laugh hysterically.

What if this could be applied to rescheduling appointments? “Alexa, call my doctor and reschedule my appointment to 5 PM tomorrow or something as close to that as possible.” Alexa could then convincingly place the call, change the appointment, and report back.

TTS-powered agents could completely change the robocall industry and actually make it acceptable again for these agents to call us. There’s also the nightmare scenario of these bots being able to completely harass someone, mess with their appointments and dates, or claim to be calling on behalf of someone they’re not, at a rate that would be unstoppable.

The other possibility is that, with enough samples, the TTS could mimic our own voices (Lyrebird and VoCo are examples of these technologies) and be even more convincing. These synthesized versions of ourselves could handle our inbound calls.

We’re going to have to deal with this reality this coming year.

Written by Leor Grebler