
The Conversational UI in the Middle of the Room
CUI Lessons Learned from IVR
While not a new idea, the past year has seen a groundswell of interest in conversational commerce and conversational user interface (CUI) design. By and large the discussion has focused on text-based or chat-based contexts, whether a rethinking of iOS push notifications, speculation on what Apple might accomplish with something like MessageKit, the mounting rumors surrounding Facebook’s M (expected to be announced next month), Slackbot, or, most recently, Microsoft’s now-official “conversational platform” plans powered by Cortana’s minions. Largely spurred by the “accidental” success of WeChat in Asia, combined with recent advances in natural language processing and artificial intelligence, the promise of conversational commerce rests on the perceived convenience of interacting with brands and services in much the same way we interact with family and friends, and on the comfort of asynchronous, text-based interaction generally. The discussion has also carried the not-so-subtle subtext of moving past our app-centric world, especially among those players who have not benefited from a robust mobile app ecosystem.
For us at Versay, designing compelling conversational experiences is core to what we do, and has been for the past fifteen years. So it is with great interest and, at times, a bit of déjà vu, that we have been watching this surge in interest in the paradigm-shifting potential of conversational commerce. The conversations we have designed and built over the years have been focused on replacing legacy touch-tone interactive voice response (IVR) systems with technology-agnostic solutions, in many cases (but not all) powered by advanced speech recognition software. We have improved enterprise customer care over the telephone by making these systems more intuitive to use and, in so doing, we have advanced best practices informed by human-computer interaction and usability research, learnings from cognitive linguistics, and speech science.
Certainly, there are differences between speech IVR systems and emerging chat bot systems. For example, a text-based UI can be much more forgiving. We have the luxury of autocorrect and the opportunity to revise our request before sending if we think the intent might be confused. Factors such as regional accents and background noise that have long plagued the effectiveness of speech recognition are non-factors with text input. And perhaps the most striking difference is the asynchronous experience of chat — the conversation can be interrupted and picked back up again on demand. On the phone, much of the work we do is to ensure the conversation stays on course and the turn taking between the customer and the system flows as naturally as possible, given the real-time nature of a telephone conversation.
Yet, given that both types of systems are intent-driven and conversational, they share many similarities too. Many lessons from the work that has been done in voice user interface (VUI) design carry over to the broader context of text-based conversational applications, chat bots, and the like.
Develop your “Persona”.
In the IVR world, as in the text-based world of customer support, you need to speak in a tone that matches your business voice. I’m not advocating for a robot that pretends to be a real person, but the tone of the IVR, the text-based bot, and even your human agents must be tuned to match the image your business wants to project. There was a moment in IVR’s past when persona was all the rage. In many cases it was overdone and got in the way of an effective experience. If recent chat bots are any indication, the same challenge is cropping up again.
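As a rough illustration, one way to keep that tone consistent is to define the persona once and have every prompt draw from it. The sketch below is hypothetical; the Persona class, the “Acme assistant” name, and its fields are illustrative assumptions, not part of any real framework.

    from dataclasses import dataclass

    # A single, shared definition of the bot's voice, agreed with the brand team,
    # so individual prompts don't drift into different tones.
    @dataclass(frozen=True)
    class Persona:
        name: str
        register: str            # e.g. "warm but efficient"
        uses_contractions: bool
        apology_style: str       # how the bot admits it did not understand

        def greeting(self):
            hello = "Hi, I'm" if self.uses_contractions else "Hello, I am"
            return f"{hello} {self.name}. How can I help you today?"

        def apology(self):
            return self.apology_style

    # Defined once, reused everywhere a prompt is written.
    ACME_ASSISTANT = Persona(
        name="the Acme assistant",
        register="warm but efficient",
        uses_contractions=True,
        apology_style="Sorry, I didn't catch that.",
    )

    print(ACME_ASSISTANT.greeting())
    print(ACME_ASSISTANT.apology())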
Keep it simple.
Don’t insult anyone’s intelligence with too many instructions. As Matty Mariansky points out, you need to introduce yourself and provide an example of what you can do, along with a call to action. In the IVR realm, this is crucial in a natural language application. The open-ended “Say Anything Main Menu” convention may require a quick example to effectively move the caller to the next step:
“How may I help you? You may say things such as, ‘I need to file a claim’ or ‘I’d like to pay my bill.’ Just go ahead and tell me what you would like to do.”
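Here is a minimal sketch of that “introduce, give an example, call to action” pattern; the function name and the sample intents are illustrative, and a production prompt would be tuned with real caller data.

    def open_ended_prompt(examples, max_examples=2):
        """Build an open-ended main-menu prompt with just enough guidance."""
        # Cap the examples: too many turns the greeting into an instruction dump.
        shown = examples[:max_examples]
        sample = " or ".join(f"'{e}'" for e in shown)
        return (
            "How may I help you? "
            f"You may say things such as, {sample} "
            "Just go ahead and tell me what you would like to do."
        )

    print(open_ended_prompt(["I need to file a claim.", "I'd like to pay my bill."]))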
Remove the training wheels.
This phrase (another hat tip to M. Mariansky in the article referenced above) captures what happens in a well-designed IVR based on what we know from the caller’s phone number. If they are a frequent caller, we adjust the greeting to remove the examples and allow the caller to “barge in” on the prompt if they already know what the question is asking.
And the opposite is true. Sometimes you detect struggles and you have to put the training wheels back on…
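In code, the decision is little more than a branch on what we already know about the caller. This is a hypothetical sketch; the CallerProfile fields and the thresholds are assumptions about the kind of history a real system would keep.

    from dataclasses import dataclass

    @dataclass
    class CallerProfile:
        calls_in_last_90_days: int
        recent_recognition_failures: int

    FULL_PROMPT = ("How may I help you? You may say things such as, "
                   "'I need to file a claim' or 'I'd like to pay my bill.'")
    SHORT_PROMPT = "How may I help you?"

    def choose_greeting(caller):
        """Return (prompt_text, barge_in_enabled) for this caller."""
        if caller.recent_recognition_failures >= 2:
            # Put the training wheels back on: full examples, no barge-in.
            return FULL_PROMPT, False
        if caller.calls_in_last_90_days >= 3:
            # Frequent caller: drop the examples and let them interrupt the prompt.
            return SHORT_PROMPT, True
        return FULL_PROMPT, True

    print(choose_greeting(CallerProfile(calls_in_last_90_days=5,
                                        recent_recognition_failures=0)))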
Handle errors gracefully.
Error handling should not waste the user’s time, but it should rephrase the question to give them a chance to recover and get back on track. With IVR, if a caller is struggling with the speech recognition, we will provide more detail about the information we are trying to gather, or we might switch input methods altogether and suggest using the keypad instead, mapped to a specific list of options.
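One common shape for this, sketched below under assumed wording and an assumed three-strike threshold, is an escalating list of reprompts that ends in a switch to keypad (DTMF) input.

    # Reprompts escalate: quick rephrase, then more detail, then a change of input method.
    REPROMPTS = [
        "Sorry, what was the policy number?",
        "I still didn't get that. The policy number is the ten-digit number "
        "at the top of your statement. What is it?",
        "Let's try it another way. Using your keypad, please enter the "
        "ten-digit policy number.",
    ]

    def next_error_prompt(consecutive_failures):
        """Return (prompt, input_mode) after this many failed recognition attempts."""
        index = min(consecutive_failures, len(REPROMPTS)) - 1
        mode = "dtmf" if index == len(REPROMPTS) - 1 else "speech"
        return REPROMPTS[index], mode

    for failures in (1, 2, 3):
        print(failures, next_error_prompt(failures))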
Be conscious of turn-taking.
Silence is a tool. We have to know when to accept silence because the caller is thinking or looking up the answer to the question, and we have to make an educated guess as to when the caller is politely waiting for us to repeat ourselves because they don’t know whether saying “Could you repeat that?” will even work.
This issue isn’t as pronounced in a text-based UI, but the question is the same: how long should you wait for a response? And what is the polite way to ask, “Buddy? Are you still working on that answer for me?”
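Here is a sketch of treating silence as a signal rather than an error. The grace periods and the nudge wording are assumptions; a real system would tune them per question and per channel.

    from typing import Optional

    def on_no_input(question_type, seconds_of_silence) -> Optional[str]:
        """Decide what to say after a stretch of silence. None means keep waiting."""
        # Questions that require digging out an account or policy number earn
        # a longer grace period than a simple yes/no.
        grace = 12.0 if question_type == "lookup" else 5.0
        if seconds_of_silence < grace:
            return None  # the user is probably thinking or finding the answer
        # Past the grace period, assume they may be politely waiting on us.
        return "Are you still there? If you'd like, I can repeat the question."

    print(on_no_input("lookup", 6.0))   # None: still inside the grace period
    print(on_no_input("yes_no", 6.0))   # the gentle nudge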
Allow users to “barge-in” to the conversation.
You cannot barge in on a text prompt the way you can a speech prompt, but interrupting is a natural thing to do, and any conversational UI needs to be prepared to adjust when the conversational flow changes. For example, what if the question prompted on the screen isn’t the one the customer wants to answer? Perhaps they’ve just remembered a piece of information they forgot in the last step and frantically type: “Go back.”
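One way to honor that interruption, sketched below, is to check every turn against a small set of global commands before treating it as an answer to the current question. The command list and the step stack are illustrative assumptions.

    GLOBAL_COMMANDS = {"go back", "start over", "help"}

    def handle_turn(user_text, step_stack):
        """Route one turn: global command first, otherwise an answer to the current step."""
        text = user_text.lower().strip(" .!?")
        if text in GLOBAL_COMMANDS:
            if text == "go back" and len(step_stack) > 1:
                step_stack.pop()                      # abandon the current question
                return f"Okay, back to {step_stack[-1]}."
            if text == "start over":
                del step_stack[1:]
                return f"No problem, starting over at {step_stack[0]}."
            return "You can say 'go back' or 'start over' at any time."
        # Not a command: treat it as the answer to the question on top of the stack.
        return f"Got it, recording that for {step_stack[-1]}."

    steps = ["main menu", "claim date", "claim amount"]
    print(handle_turn("Go back.", steps))   # -> Okay, back to claim date.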
Always provide backup.
Right now, the text bots helping me fill out my Slack profile and assistants like Alexa don’t provide an easy way for me to get to a person if I need further assistance. In most cases, I don’t need that today, but as these services continue to expand and grow more complex, it will become essential. (Welcome to the game, Skype!)
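A sketch of one reasonable escalation policy follows: hand off whenever the user asks for a person outright, or after repeated failures. The trigger phrases, the three-failure threshold, and the handoff wording are all assumptions.

    AGENT_REQUESTS = ("agent", "representative", "talk to a person", "human")

    def should_escalate(user_text, consecutive_failures):
        """Escalate on an explicit request for a person, or after too many misses."""
        asked_for_person = any(phrase in user_text.lower() for phrase in AGENT_REQUESTS)
        return asked_for_person or consecutive_failures >= 3

    def handoff_message(queue_wait_minutes):
        return ("Let me get you to someone who can help. "
                f"The current wait is about {queue_wait_minutes} minutes.")

    if should_escalate("I just want to talk to a person", consecutive_failures=0):
        print(handoff_message(queue_wait_minutes=4))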
The above examples illustrate that there are a number of areas where well-established VUI design best practices can inform what to do and, more importantly, what not to do, as we come back around to conversational design in a messaging context and the possibilities enabled by these new platforms. There will be growing pains, no doubt, but many mistakes and failed experiments can be avoided.