SpeechTEK 2019 Review: Conversational AI is now all about the telephone channel

Yves Normandin
May 3 · 4 min read

Conversational AI was clearly one of the biggest themes this year at SpeechTEK (Apr 29 to May 1, 2019, Washington DC). And SpeechTEK being a speech technology conference, the emphasis was naturally on voice rather than text conversations. Conversational AI is first and foremost about Intelligent Virtual Agents/Assistants (IVAs), which are robots that provide services through a user interface that simulates human-to-human communication. Of course, there is nothing really new about all this. Chatbots have been all the rage for over three years. What was really new this year at SpeechTEK was the unmistakable feeling that conversational AI over the phone had suddenly become mainstream. Clues were everywhere, starting with the strong participation of Google Cloud, Twilio, and Gridspace, all Diamond Sponsors at the conference. Both Twilio and Google Cloud had keynote sessions with conversational AI as the main topic. But the strongest hints of all came from casual conversations with attendees, who by and large saw telephone IVAs as inevitable and were looking for the best solutions to turn them into a reality.

Telephone calls are not going away

Maybe this has to do with the fact that call volumes into contact centers aren’t going down, while customer expectations are going up as a result of people’s experience with personal assistants like Siri and Google Assistant, as well as smart speakers like Amazon Echo and Google Home. In that context, companies cannot continue to ignore an old IVR system that is increasingly becoming the worst part of their customers’ journey with them.

The challenge is how to provide that great conversational experience over the telephone. In the chatbot world, the great majority of developers realized long ago that it’s hard to make natural language understanding (NLU) technology work well enough to provide a great user experience, and they have mostly resorted to very directed, menu-based interactions, limiting the use of NLU to where it is absolutely necessary. And, by and large, that works quite well for many use cases. But in the IVR world, that’s not an option, because directed, menu-based interactions are what most companies offer today and users simply hate them.

Accuracy is key…

In their talks, Google Cloud rightly insisted on the importance of speech-to-text (STT) and NLU accuracy. Last year, Google Cloud introduced enhanced acoustic models for the telephone, which cut error rates in half, and I’m sure they will continue to improve accuracy. I also expect STT vendors to eventually introduce features that will enable developers to further improve accuracy by telling the engine what types of responses are most likely. Even if, in principle, users can say anything, the odds are high that if the bot is asking a question, the user will respond to that question rather than say something totally unrelated. Being able to give the STT engine indications about what users are likely to say can make a huge difference in accuracy. Speech technology vendors like Nuance have known that forever, and they make sure that developers have this kind of control.

…but conversational user experience design is critical

Another, related topic that was discussed at length at SpeechTEK is the importance of conversational user experience (CUX) design expertise and the lack of such expertise in the market. This has been an issue for chatbots, but to a certain extent companies have been able to work around it by leveraging rich media features available on messaging channels. On the telephone, where the interaction is primarily a voice conversation, CUX expertise is critical. Simply managing a conversation that is natural and productive for the user requires strong CUX skills. But the challenge is even greater when the bot always has to deal with uncertain STT and NLU inputs and therefore has to use efficient repair dialogue to handle this uncertainty. There was, by the way, a very interesting presentation on this topic at the conference by Bruce Balentine.
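To make the idea of repair dialogue a little more concrete, here is a minimal sketch of one common pattern: choosing the bot's next prompt based on how confident the recognizer is. Everything in it is an assumption made for illustration. The `Recognition` class, the `next_prompt` function, and the two threshold values are hypothetical and do not come from any vendor's API or from the talks at the conference.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """A single STT/NLU result (hypothetical shape, for illustration only)."""
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


# Hypothetical thresholds; in practice they would be tuned on real call data.
ACCEPT_THRESHOLD = 0.80
CONFIRM_THRESHOLD = 0.45


def next_prompt(question: str, result: Recognition) -> str:
    """Pick the bot's next utterance from the recognition confidence."""
    if result.confidence >= ACCEPT_THRESHOLD:
        # High confidence: acknowledge and move on.
        return f"Got it, {result.text}."
    if result.confidence >= CONFIRM_THRESHOLD:
        # Medium confidence: confirm explicitly before acting on the input.
        return f"I think you said {result.text}. Is that right?"
    # Low confidence: apologize and ask the original question again.
    return f"Sorry, I didn't catch that. {question}"


if __name__ == "__main__":
    question = "Which city are you flying to?"
    for confidence in (0.93, 0.60, 0.20):
        print(next_prompt(question, Recognition("Boston", confidence)))
```

The point of the three tiers is that the bot only spends the caller's time on a confirmation or a re-prompt when the input is actually doubtful. Choosing the wording of those prompts is exactly where CUX expertise comes in.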

Beyond IVAs

Finally, beyond IVAs, there was also a lot of discussion at SpeechTEK about other dimensions of conversational AI, namely agent assistants and speech analytics. Agent assistants provide real-time guidance to agents based on analysis of the ongoing conversation between the agent and the customer. This was one of the topics discussed by Google Cloud during their keynote (part of their Contact Center AI offering), but other vendors also presented solutions in that space, including ttec Associate Assist and Gridspace Relay.

So, all in all, a very interesting conference, from my perspective. Let me know if you detected other important trends at the conference.

