Conversational interfaces are everywhere these days. Chatbots are said to be conversational. Virtual assistants are conversational. And now, interactive voice response systems, or IVRs, should also be conversational. But what does that mean? I have previously written a couple of posts about so-called conversational chatbots and how often the term “conversational” is misused. I have also discussed how hard it is to build truly conversational interactions. Now, what does it mean in the context of an IVR?
In this post, I will first revisit what I consider essential criteria for a conversational interface, regardless of the channel or application. Next, I will describe the characteristics specific to an IVR that should be taken into account, and how they differ from those of other interfaces. Finally, I will explore what it means to design conversational interactions in the context of an IVR.
Conversational Interface Requirements
One fundamental criterion that must be met for an interface to be called conversational is the possibility for the end user to be an active participant in the exchange; in other words, the user must be able to:
- Express their requests in their own words;
- Base their requests on their actual needs, not on what the application proposes (i.e., without needing a menu or list of options);
- Include more information when answering a question than what the question specifies;
- Interrupt an ongoing task with an unrelated question, get an answer and resume the previous task (in other words, digress);
- Change their mind, e.g., correct a piece of information, go back, interrupt and cancel a task, etc.;
- Switch topics or tasks.
And the conversational interface must be able to:
- Understand the meaning of free-form spoken or written utterances and extract relevant information from them;
- Keep track of all the input provided by the user to ensure that the application never asks for information that was already provided;
- Consider context when interpreting the meaning of a user’s input;
- Handle digressions and resume tasks;
- Handle change and cancel requests.
This list is not exhaustive, but it provides a basis to determine whether a given interface or application is conversational or not. For instance, a DTMF IVR is not a conversational interface, nor is a speech-enabled directed-dialogue phone application. Other examples of user interfaces that are not conversational are one-shot question-answer sequences such as “OK Google, what’s the weather forecast for today?”, or chatbots that only accept clicking or tapping as input.
This strict definition of a conversational interface can also be applied in the context of an IVR, or interactive voice response system.
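To make one of the requirements above more concrete, here is a minimal Python sketch of the "never ask for information that was already provided" rule, which also covers over-answering and corrections. Everything in it is an assumption made for illustration: the bill-payment task, the slot names, and the `DialogueState` class are hypothetical, and the natural language understanding step is assumed to have already turned the caller's utterance into a dictionary of extracted values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical slots for an illustrative "pay a bill" task.
REQUIRED_SLOTS = ("account_number", "amount", "payment_date")


@dataclass
class DialogueState:
    """Keeps track of everything the user has provided so far."""
    slots: Dict[str, str] = field(default_factory=dict)

    def update(self, extracted: Dict[str, Optional[str]]) -> None:
        # Merge whatever the user said, even if it was not asked for
        # (over-answering). A new value for a known slot overwrites the
        # old one, which is how a correction is handled here.
        for name, value in extracted.items():
            if value is not None:
                self.slots[name] = value

    def next_question(self) -> Optional[str]:
        # Ask only for the first slot that is still missing.
        for name in REQUIRED_SLOTS:
            if name not in self.slots:
                return name
        return None  # Nothing left to ask; the task can proceed.


state = DialogueState()
# "I want to pay 50 dollars on Friday": two slots in a single utterance.
state.update({"amount": "50 dollars", "payment_date": "Friday"})
print(state.next_question())  # only the account number is still missing
```

A real dialogue engine would also need to handle digressions, cancellations, and context-dependent interpretation; this sketch only covers input tracking.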
An IVR is just that annoying phone system with the menus, right?
Sure, but it can be so much more than that.
Here are some essential characteristics of IVR systems (as opposed to other voice-only interfaces):
- An IVR is the doorway to a contact center, i.e., human agents;
- Because of that, it must be integrated into the contact center ecosystem and with the contact center (CC) platform;
- An IVR is reached from a phone number, not a smart speaker device; this means, among other things, that it can be called from many different physical environments (sometimes very noisy);
- An IVR is used to reach an enterprise and obtain a service (although this is not strictly limited to IVRs, it is an essential characteristic);
- The caller population is wildly diverse, not limited to young, curious, tech-savvy users who are willing to participate;
- Users calling an IVR often do so when they have exhausted other self-service resources and need the help of a human agent. They are often impatient and already frustrated;
- Although IVRs can be used to perform self-service transactions, they are more frequently used to triage callers and route them to the right agent.
So, basically, an IVR deals with callers who, for the most part, have a specific reason for calling and want to speak to an agent. A good conversational strategy should help make the journey from the caller to the agent an efficient, low-effort, and easy transition.
To be clear: the purpose of an IVR is not to help you find a restaurant in your area, get the weather or play a game. The IVR connects the customer to the contact center or allows the user to make transactions in self-service mode, with fallback to a human agent if they need assistance. These are very different use cases, with specific functional and technical requirements.
What is needed to build conversational IVR interactions
To create fully conversational IVR interactions that go beyond call steering applications, several criteria must be met, both from a dialogue engine standpoint (some of which were listed at the top of this post) and from a voice user interface (VUI) perspective.
Here are a few VUI-specific requirements that a conversational platform must meet in the context of a conversational IVR:
- Handle real-time aspects, such as timeouts and barge-in;
- Make the distinction between no-match and no-input errors, and provide appropriate prompting;
- Handle confidence scores and thresholds, for input evaluation and confirmation;
- Handle n-bests, for situations where the IVR may propose a second hypothesis upon negative confirmation;
- Handle list navigation (i.e., navigating a list of elements using “previous”, “next”, and “that one” commands, and processing them adequately in real time);
- Handle global commands, like “repeat”, “start over” or “agent”;
- Provide adequate output rendering: TTS rendering, audio file concatenation, pausing, etc.;
- Record and store calls;
- Provide a decent DTMF fallback, because some people cannot or will not speak with the machine.
This is not an exhaustive list, but it sheds light on the challenges involved in finding a platform that provides both an advanced dialogue engine and IVR-adapted functionality.
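To illustrate how a few of these requirements fit together, here is a minimal Python sketch covering the no-input versus no-match distinction, confidence thresholds, and the use of an n-best list after a negative confirmation. It is not tied to any particular platform: the `Hypothesis` class, the function names, and the threshold values are all assumptions chosen for the example, and the recognizer is assumed to return its hypotheses sorted from most to least confident.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative threshold values only; real values need tuning per application.
REJECT_BELOW = 0.40   # below this, treat the result as a no-match
CONFIRM_BELOW = 0.80  # below this (but above reject), ask for confirmation


@dataclass
class Hypothesis:
    text: str
    confidence: float


def decide(n_best: Optional[List[Hypothesis]]) -> str:
    """Choose the next dialogue move from a recognition result.

    Assumed convention: None means the timeout expired with no speech
    (no-input); an empty list means speech was heard but nothing was
    recognized (no-match).
    """
    if n_best is None:
        return "no-input"   # e.g. "Sorry, I didn't hear anything."
    if not n_best or n_best[0].confidence < REJECT_BELOW:
        return "no-match"   # e.g. "Sorry, I didn't get that."
    if n_best[0].confidence < CONFIRM_BELOW:
        return "confirm"    # e.g. "Did you say Austin?"
    return "accept"


def next_hypothesis(n_best: List[Hypothesis], rejected: str) -> Optional[Hypothesis]:
    """After the caller says "no", propose the next plausible hypothesis."""
    for hyp in n_best:
        if hyp.text != rejected and hyp.confidence >= REJECT_BELOW:
            return hyp
    return None  # Nothing plausible left; re-prompt or offer DTMF or an agent.


results = [Hypothesis("Austin", 0.62), Hypothesis("Boston", 0.55)]
print(decide(results))                     # medium confidence: confirm first
print(next_hypothesis(results, "Austin"))  # second hypothesis if the caller says no
```

Keeping no-input and no-match as separate outcomes lets the application play different prompts for each, and escalate differently when they repeat.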
Finding the ideal solution has proven to be challenging, although several possibilities exist. Our team will explore various challenges in more detail in upcoming posts.