Replacing My Old Interactive Voice Response (IVR) System with IBM Watson AI Technologies

Marco Noel
IBM Watson Speech Services
Sep 11, 2019
More and more transactions are done through an IVR | Photo by Austin Distel on Unsplash

In a previous post, I covered how many customers currently use an IVR to deliver basic information and process simple transactions (press “1” for hours of operation, press “2”…, press “0” to speak to an agent).

Over the past few years, some customers started bringing AI-powered chatbots to their websites, delivering quick value to their users. Now, they wish to replace their old IVR with their cool new chatbot. Where to start?

First, let’s recap the components required for a typical IVR solution built with Watson services:

In late August 2019, IBM released a new offering called “Watson Assistant for Voice Interaction”, packaging all the Watson components mentioned above that are required for IVR solutions. Check the following Medium article from Tom Banks for more details.

Diagram showing the data flows and components involved in Watson Assistant for Voice Interaction offering

I already covered the Watson Speech-to-Text component in my previous series of articles “Watson Speech-To-Text: How to Train Your Own Speech “Dragon””.

This article provides some guidelines on how to bring your existing, fully functional Watson Assistant chatbot into your brand new Watson IVR, or how to add one from scratch.

IMPORTANT NOTE: For simplicity, the following diagrams are shown in sequence to highlight the different steps. It is implicit that you should ALWAYS iterate within each phase until you get the expected outcomes.

Review and Confirm Your Use Case

Although very obvious, this phase is the most critical as it will become your foundation. As you review your use case(s), you should know clearly who your target users are and, within them, the different personas that will influence the user experience and the business flows. With an existing web chatbot, you have access to some very useful insights: usage, completed transactions, what is working and what is not, etc. If you have an existing IVR, identify the pain points and causes of frustration, like why users are ALWAYS pressing “0” to talk to an agent. Go through each step, validate, and confirm.

The user experience (UX) is also extremely important, especially when building a new IVR. Users do not interact the same way with a web chatbot as with an agent or a voice system. It’s easy to read long and detailed text responses, but listening to them can be painful, and users tend to be more impatient with voice responses. Don’t be surprised: it’s not unusual to have to redesign your business flows a little and curate your long responses into something more “voice-friendly”.
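As an illustration, here is a minimal sketch (plain Python, not a Watson API) of the kind of curation a voice channel needs; the helper name and the sample response are hypothetical:

```python
import re

def to_voice_friendly(web_response: str, max_sentences: int = 2) -> str:
    """Trim a long web-chat response into a shorter, voice-friendly prompt.

    Illustrative heuristic only: strip markup a text-to-speech engine
    would read aloud, keep the first few sentences, then offer more
    instead of reading everything.
    """
    # Replace markdown links with their visible text (a TTS engine would read the URL)
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", web_response)
    # Drop bold markers and collapse newlines
    text = text.replace("**", "").replace("\n", " ").strip()
    # Keep only the first few sentences
    sentences = re.split(r"(?<=[.!?])\s+", text)
    short = " ".join(sentences[:max_sentences])
    # Offer the rest instead of reading it all
    if len(sentences) > max_sentences:
        short += " Would you like more details?"
    return short

web = ("Our offices are open **Monday to Friday, 9 am to 5 pm**. "
       "We are closed on statutory holidays. "
       "See [our website](https://example.com/hours) for exceptions.")
print(to_voice_friendly(web))
```

In practice you would curate the responses by hand in your dialog nodes; a helper like this just shows the kind of trimming involved.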

If one step is not clear, don’t be scared to iterate: go back and revisit the previous step(s), validate, and redo if necessary. One common mistake is having a scope that is too large (e.g., answering all insurance questions) versus a smaller, more achievable scope (e.g., answering questions on car insurance). Always start small, then incrementally add to it.

At the end, you should have enough to start designing your business flows on a whiteboard, flip charts, or using software such as Visio or OmniGraffle.

Design Your Business Flows

As you are designing your flows, the first thread is to identify your intents (what people are looking for) and the data inputs (membership number, date of birth, etc.) you need to complete your flows and processes. This goes into Watson Assistant.
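To make this first thread concrete, here is a minimal sketch of such an inventory; the intent names, utterances, and data inputs are hypothetical examples for a car-insurance use case, not an actual Watson Assistant export:

```python
# Illustrative inventory of intents and their training utterances.
# All names and phrases here are invented for illustration.
intents = {
    "check_coverage": ["am I covered for windshield damage",
                       "what does my policy cover"],
    "file_claim":     ["I want to file a claim",
                       "I had an accident"],
}

# Data inputs each flow must collect before its transaction can complete
required_inputs = {
    "file_claim": ["membership_number", "date_of_birth", "incident_date"],
    "check_coverage": ["membership_number"],
}

def missing_inputs(intent: str, collected: dict) -> list:
    """Return the data inputs still needed to complete this intent's flow."""
    return [slot for slot in required_inputs.get(intent, [])
            if slot not in collected]

print(missing_inputs("file_claim", {"membership_number": "1234567"}))
# → ['date_of_birth', 'incident_date']
```

An inventory like this, even on paper, tells you exactly what your dialog nodes must ask for before the SOE can call a backend system.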

The second thread is to identify the different components required for your architecture. The last thread is the identification of your different integration points. We will cover them in more detail later.

At this phase, you start to align the different teams and resources for each area.

Identify Your Intents and Data Inputs

Now that your intents and data inputs are identified, you need to plan your data collection strategy — refer to the Data Collection section of my previous Watson STT Part 1 article for more details.

With your collected audio data and transcriptions, you can now configure your intents, entities, and dialog flows in Watson Assistant. If needed, you will also build and train your Watson STT Language Model(s), Acoustic Model(s) and Grammar(s) for voice recognition; for more details, check the Watson STT Part 2 article.
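For example, if a flow collects a membership ID spoken digit by digit, a grammar constrains what STT will recognize, and your application still has to normalize the transcript into the actual value. Here is a minimal, hypothetical sketch of that normalization step; the seven-digit ID format is an assumption for illustration:

```python
import re
from typing import Optional

# Spoken digit words as STT might transcribe them (including "oh" for zero)
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def normalize_member_id(transcript: str, length: int = 7) -> Optional[str]:
    """Turn a transcript like 'one two three four five six seven' into
    '1234567'; return None if it does not match the expected format."""
    digits = "".join(DIGIT_WORDS.get(tok, tok)
                     for tok in transcript.lower().split())
    return digits if re.fullmatch(r"\d{%d}" % length, digits) else None

print(normalize_member_id("one two three four five six seven"))  # 1234567
print(normalize_member_id("one two maybe"))                      # None
```

A Watson STT grammar remains the better first line of defense, since it constrains recognition itself; post-processing like this then becomes a simple safety check.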

As you are building and training the different Watson APIs, you also start building your unit test plans and your user acceptance test (UAT) plans.

Identify Your Different Architecture Components

In this thread, you start documenting your architecture and how the components interact with each other. This is where you decide whether you go full IBM Cloud, full on-premises (Cloud Pak for Data) or hybrid cloud, depending on your functional, non-functional, data, and security requirements. You also design your multiple environments for development, UAT, and production.

This phase typically involves multiple teams covering networking, firewalls, and hardware appliances, along with ordering hardware and instances. This is where most project delays occur, so you must plan accordingly and involve these teams early.

Identify Your Integration Points

If you have a front-end call center solution that receives and manages incoming calls to agent queues, you need to consider how you will integrate it with the IBM Watson solution. Some things to consider are the supported voice protocols (e.g., SIP), which system will anchor the calls, and what data needs to be transferred between Watson and the call center solution.

If your business flows require some sort of data validation and update against existing backend systems, you need to know which ones and if they have existing APIs. You also need to know if you get the right information from these APIs. If not, you will need to build/enhance them.

To handle the data exchanges between all these different components, you need to start designing your Service Orchestration Engine (SOE) with all the required data formats and transformations, storing and managing the payload needed to complete your business flows, and more.
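As a minimal sketch of one such SOE transformation, assuming a hypothetical backend membership API (the field names on both sides are invented for illustration):

```python
# Illustrative SOE step: map the context gathered during the call into
# the payload a (hypothetical) backend membership API expects.
def build_backend_payload(wa_context: dict) -> dict:
    """Transform Watson Assistant context into a backend API request body."""
    return {
        "memberId": wa_context["membership_number"],
        "dob": wa_context["date_of_birth"],          # e.g. an ISO 8601 date
        "channel": "IVR",                            # tag the originating channel
        "sessionRef": wa_context.get("session_id"),  # for call-level tracing
    }

context = {
    "membership_number": "1234567",
    "date_of_birth": "1980-05-14",
    "session_id": "abc-123",
}
print(build_backend_payload(context))
```

Keeping these mappings in one place in the SOE, rather than scattered across dialog nodes, makes it much easier to swap or version a backend API later.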

The outputs of these three previous threads should be documented in an Architecture Design document. Keep in mind that this is a living document: it does not have to be final and cast in stone to begin your project. I can tell you that you WILL update it and you WILL discover things you did not think about at the beginning… and that’s OK.

Design Your Architecture and Start Building

This is the phase where you start to execute and build your solution. With these brand new components available in your solution, you will add new test cases and adjust existing ones in your Unit and UAT test plans.

One thing that I found really useful on a few voice engagements was to have a “vanilla” web chatbot (text only) in parallel with the voice environment. This was really helpful to validate some specific “SOE-to-Watson Assistant” functionalities like the Watson Assistant flows, inputs and responses from backend APIs, without having to make a phone call. When it works through the web chat and you have a problem with the voice environment, it narrows down your list of possible root causes.

Although not shown here, it is recommended that you break your builds into sprints (typically 2 or 3 weeks) using the Agile methodology. Conduct playbacks and collect the feedback. On every customer engagement I was involved with, I strongly recommended involving the target users as early and as often as possible, because THEY are the ones who will determine if your solution is a success or a failure. To this day, whenever I bring it up with clients, I get pushback at first, and in the end, it’s the number one lesson learned I hear from them. The longer you wait, the greater the risk… Trust me: your target users will be so grateful that you did it.

Conduct Unit Tests and UAT Tests on Your Watson APIs

Now, we are ready to execute your different test plans against the different units. Here are some examples:

  • When I call the 800 number, I hear the Watson greeting message
  • When I give an intent, I get the expected response (right branch of the flow)
  • If I give a data input (e.g., membership ID), it plays it back properly
  • When I provide a valid membership ID, I hear the right confirmation (e.g., “Is this for John Smith?”) and I am branched to the right next step
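Checks like these can also be automated against the text channel. Below is a minimal sketch using a stubbed dialog function standing in for the real SOE-to-Watson Assistant round trip; every prompt and response here is a hypothetical example:

```python
# Stub standing in for the real SOE-to-Watson Assistant round trip.
# All prompts and responses are invented for illustration.
def stub_dialog(user_input: str) -> str:
    if user_input == "<call_start>":
        return "Welcome to ACME Insurance. How can I help you?"
    if "file a claim" in user_input:
        return "Sure, I can help with that. What is your membership ID?"
    if user_input.isdigit() and len(user_input) == 7:
        return "Is this for John Smith?"
    return "Sorry, I didn't get that."

# Greeting plays on call start
assert stub_dialog("<call_start>").startswith("Welcome")
# Intent routes to the right branch of the flow
assert "membership ID" in stub_dialog("I want to file a claim")
# Valid membership ID triggers the right confirmation
assert stub_dialog("1234567") == "Is this for John Smith?"
print("all unit checks passed")
```

Once these pass against the web chat, any remaining failure over the phone line points you toward the voice layer (STT, telephony, SOE) rather than the dialog logic.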

If you get an unexpected behaviour, here are some things to check:

  • I did not get the right STT transcription — Add more training data to my STT Language Model adaptation or fix my STT Grammar. Once done, re-train STT and re-test.
  • I got the right STT transcription but not the right Watson Assistant intent or entity — Add utterances to the intent or add a synonym to the existing entity, then re-test.
  • The Watson Assistant flow is not behaving as expected — Check your Watson Assistant flows through the web chatbot and make sure they work as designed. Check your context variables and the data you should receive from the SOE. Make sure your backend systems are available and your APIs are returning the expected data payload.

Run multiple iterations of your Unit and UAT tests, enhance the WA and STT training where applicable, and once complete, move these changes from your Development environment to your UAT environment.

Deploy to a Small Pilot Group in Your UAT Environment

Identify a small group of target users as your pilot users. If possible, try to bring new users who are not familiar with the system. You’ll get more representative feedback.

Provide a copy of your UAT test plan with valid data to each of them, then deliver a training session. Explain to them what to do and how to collect their feedback: pass/fail, what they liked and did not like, where they struggled, etc. Use a document or a spreadsheet.

Collect the audio and transcriptions from these tests, especially the failures. By listening to these calls, you will easily find explanations: hesitations, noise, stuttering. Make sure your flows behave as expected in these situations. You will also find other gaps that will require more training for Watson Assistant and STT, fixing some dialog flows, identifying unexpected behaviours, etc.

When results are to your satisfaction, you can now deploy to Production.

Move to Production

When you deploy to production, it will never be a “fire-and-forget”.

You will most probably get new users you never accounted for, with new accents, jargon, environments, and very creative ways to ask for information.

And when you think you have it under control, think again. As your users become familiar with your solution and processes, and as new devices become available, they evolve, and so do their interactions.

Monitor the collected conversations, identify “opt-outs” or dropped calls, find out why and adapt accordingly. In some cases, you will just need to enhance the Watson APIs with new incremental training data.

I bet that you already have new use cases lined up to add to this solution. No problem! Now that you know how to “rinse and repeat”….

Enjoy!


Sr Product Manager, IBM Watson Speech / Language Translator. Very enthusiastic and passionate about AI technologies and methodologies. All views are only my own