How to Improve Your Chatbot — Coverage vs. Quality

Improving chatbot topic understanding used to be painful, but there’s an easier way!

Josh
IBM watsonx Assistant
7 min read · Oct 9, 2020

Photo by Vek Labs on Unsplash

Setting your virtual assistant or chatbot live is an exciting (and sometimes laborious) accomplishment. You’ve built out intents, crafted your dialog, determined the personality of your new assistant, and taken a big step forward in meeting the needs of your customers in a familiar way. But, soon after launching you find yourself asking — now what? How do you know if you are adequately satisfying your customers? Are you evolving rapidly enough with their ever-changing needs? These are both totally normal (and correct) questions to be asking yourself.

Chances are, you’ll quickly notice a few areas where your assistant could use some improvements. But how do you know which problems to prioritize first? Let’s first start by classifying the kinds of problems you could be having. The first question you should ask yourself is:

Do I have quality issues or coverage issues?

Quality refers to the quality of the response from the assistant. You want the assistant to respond with the right information when a question is asked. Some examples of quality problems are:

I ask, “what’s it like outside?,” and the assistant responds saying “The sky is blue!”

I ask, “what’s my balance due?,” and the assistant responds saying “The APR on your gold card is 22%”

The examples above are due to the assistant not understanding what the user has said, which is one of the likeliest causes of quality problems (and the one we will focus on in this post). However, there are other causes of poor quality to consider, such as:

  • Two or more conflicting intents
  • Incorrect dialog/response flow
  • Poorly worded response

… and all of these will be covered in future posts.

Coverage refers to the messages your assistant is trained to “cover,” or handle. You obviously want the assistant to cover a good proportion of questions; otherwise, it’ll be frustrating to use. Some examples of coverage problems are:

I ask, “What’s it like outside?,” and the assistant responds “Can you please rephrase?”

I ask, “what’s my balance due?,” and the assistant responds “I’m not trained to handle balance questions, sorry.”

We’ve found that misunderstandings and quality issues are much more frustrating for users than a lack of coverage. That’s why we typically recommend focusing on quality first before moving to focus on increasing coverage.
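One rough way to quantify the coverage side is to measure how often messages fall through to a fallback response in your logs. The sketch below is purely illustrative: the log fields and node names are hypothetical, not a Watson Assistant API.

```python
# Hypothetical log records: each has the user's text and the node that answered it.
# "fallback" marks messages the assistant could not cover.
logs = [
    {"text": "what's it like outside?", "node": "weather"},
    {"text": "what's my balance due?", "node": "fallback"},
    {"text": "open a new account", "node": "new_account"},
    {"text": "talk to a human", "node": "fallback"},
]

def coverage_rate(logs):
    """Fraction of messages answered by a real node rather than the fallback."""
    covered = sum(1 for entry in logs if entry["node"] != "fallback")
    return covered / len(logs)

print(f"coverage: {coverage_rate(logs):.0%}")  # coverage: 50%
```

Tracking this number week over week gives you a simple signal for whether new intents and training examples are actually expanding what your assistant can handle.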

Furthermore, benchmarks on quality and coverage are often specific to the setup and business goals of each assistant. If you’re still not quite sure whether you have a coverage or quality issue, I’d recommend checking out our best practice guide. It contains some excellent guidelines to help you build out a top-notch improvement process for your virtual assistant(s).

It’s also important to note that an assistant won’t get every single thing right in one shot, at least not when it’s first launched. The reality is that the way you initially train your intents probably won’t perfectly align with what your users are asking, but that’s ok!

That’s why every assistant needs…

  • To be launched as quickly as possible (so that you can begin learning)
  • A way to clarify vague requests with users
  • A repair strategy when something goes wrong
  • A fallback strategy when the assistant doesn’t have an answer
  • The ability to learn automatically

(the good news is that Watson Assistant does all of this for you)
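To make the fallback idea from the list concrete, here is a minimal confidence-threshold sketch. It is illustrative only: the threshold, intent names, and messages are invented, not Watson Assistant's internals.

```python
FALLBACK_MESSAGE = "Sorry, I didn't catch that. Could you rephrase?"

def respond(intent, confidence, answers, threshold=0.4):
    """Answer only when the classifier is confident; otherwise fall back."""
    if confidence >= threshold and intent in answers:
        return answers[intent]
    return FALLBACK_MESSAGE

answers = {"balance_due": "Your balance is $120, due on the 15th."}
print(respond("balance_due", 0.85, answers))  # answers the question
print(respond("balance_due", 0.20, answers))  # falls back
```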

Improving Quality

If quality is an issue, it’s likely that your customers are spending too much time and effort getting the answer they seek. Think back to the earlier examples:

I ask, “what’s it like outside?,” and the assistant responds saying “The sky is blue.”

I ask, “what’s my balance due?,” and the assistant responds saying “The APR on your gold card is 22%”

In this case, the assistant is responding with a less-than-ideal answer to the question being posed. Rather than requiring your customer to reword their statement, or responding with a frustrating statement like “the sky is blue,” Watson Assistant customers can use a feature called disambiguation combined with a new feature called Autolearning to help.

Disambiguation works to get clarity from the customer when there is more than one dialog node that can respond to the customer’s input. Instead of guessing which node to respond with, your assistant will share a list of the top options with the customer, and ask them to pick the right one. In the example, rather than responding with, “The sky is blue,” the assistant would provide a list for the customer to choose from:

  • Color of the sky
  • Temperature
  • Chance of precipitation
  • Humidity
  • Wind

In this case, let’s assume I want to know if I need to wear a rain jacket or not. I’d simply select chance of precipitation from the list and the assistant will respond with the dialog response I am interested in.
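The clarify-then-answer flow above can be sketched as a simple confidence check: answer directly when one intent clearly wins, and offer a shortlist when several are close. This is an illustration of the idea, not Watson Assistant's implementation; the margin value and intent names are made up.

```python
def disambiguate(scored_intents, margin=0.15):
    """Return one intent if the winner is clear, else a shortlist for the user.

    scored_intents: list of (intent_name, confidence) pairs.
    """
    ranked = sorted(scored_intents, key=lambda pair: pair[1], reverse=True)
    top_name, top_score = ranked[0]
    # Keep every option whose confidence is within `margin` of the top one.
    close = [name for name, score in ranked if top_score - score <= margin]
    if len(close) == 1:
        return {"answer": top_name}          # confident: respond directly
    return {"ask_user_to_pick": close}       # ambiguous: list the options

result = disambiguate([
    ("chance_of_precipitation", 0.62),
    ("temperature", 0.58),
    ("color_of_the_sky", 0.55),
    ("account_balance", 0.10),
])
print(result)  # {'ask_user_to_pick': ['chance_of_precipitation', 'temperature', 'color_of_the_sky']}
```

Note how the clearly irrelevant option (account balance) never makes the shortlist, so the customer only chooses among plausible readings of their question.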

Autolearning works in concert with disambiguation to further optimize interactions with your customers. Autolearning observes and learns from your customers’ behavior and, over time, works to provide the most accurate responses to their questions. Sticking with the same example: if 100 customers asked, “What’s it like outside?” and 90 chose the chance of precipitation node, Autolearning will work to optimize for that flow. Don’t worry, it won’t get carried away creating new responses you’re not aware of. Instead, it will re-rank the options and eventually eliminate the need for clarifying responses.
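The re-ranking idea behind Autolearning can be illustrated with a toy sketch (not the product's actual algorithm): tally which disambiguation option users pick, then promote the popular choices to the top of the list.

```python
from collections import Counter

# Which option each of 100 users picked for "what's it like outside?"
# (hypothetical tallies matching the example in the text)
selections = Counter({
    "chance_of_precipitation": 90,
    "temperature": 6,
    "color_of_the_sky": 4,
})

def rerank(options, selections):
    """Order disambiguation options by how often users have chosen them."""
    return sorted(options, key=lambda name: selections[name], reverse=True)

options = ["color_of_the_sky", "temperature", "chance_of_precipitation"]
print(rerank(options, selections))
# With enough evidence, the list could be skipped and the top pick answered directly.
```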

To use this feature, step one is to ensure that you have disambiguation turned on.

Next, choose your production assistant as a source for observation.

Finally, turn Autolearning on.

You can now sit back and watch as your assistant improves itself! To track the impact of Autolearning, you can use the notebook linked off of the configuration page. Note: We’re currently hard at work on building more embedded metrics so that you can track all of this directly in the product.

Improving Coverage

If you think that coverage is an issue, then it’s likely you need to either create new intents or improve the training on the ones you already have. If you’re like me, this sounds like a daunting task. How can I possibly know all the potential questions my customers might ask about, and which are the most important to get into production first? If you use Watson Assistant, our Intent Recommendations tool makes solving this problem really easy. It works by clustering common questions your users are asking and provides a detailed list of suggested new intents to create, or additional examples to add to existing intents.

So how does Intent Recommendations work in practice?

First, select the data source you’d like to use:

Data sources can include logs from your current assistant or conversational transcripts from live agent interactions — both of which will quickly identify questions your users are already asking. If you upload a CSV, make sure it’s in the correct format.
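If your utterances live in logs rather than in a CSV already, a few lines of Python can export them. The layout below assumes one utterance per row; check the product documentation for the exact expected format, and note that the log structure here is hypothetical.

```python
import csv

# Hypothetical raw log entries; only the user's text is needed for the export.
log_entries = [
    {"user_text": "what's it like outside?", "timestamp": "2020-10-01"},
    {"user_text": "what's my balance due?", "timestamp": "2020-10-02"},
]

# Assumed layout: one user utterance per row, no header.
with open("utterances.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    for entry in log_entries:
        writer.writerow([entry["user_text"]])
```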

Next, Watson will automatically group utterances from these sources, resulting in a list of suggested intents, how often each occurs, and the example utterances they correlate back to.
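To build intuition for what this grouping step does, here is a deliberately crude stand-in that clusters utterances by their shared content words. Watson's actual method is far more sophisticated; the stopword list and data are invented for illustration.

```python
from collections import defaultdict

utterances = [
    "what is my balance due",
    "when is my balance due",
    "how do I reset my password",
    "reset password please",
]

STOPWORDS = {"what", "is", "my", "when", "how", "do", "i", "please"}

def keyword_signature(text):
    """Crude cluster key: the sorted content words of the utterance."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return tuple(sorted(words))

clusters = defaultdict(list)
for utterance in utterances:
    clusters[keyword_signature(utterance)].append(utterance)

for signature, members in clusters.items():
    print(signature, "->", members)
```

Each resulting group corresponds to a candidate intent ("balance due," "password reset"), and its member utterances become candidate training examples, which is the same shape of output the Intent Recommendations tool presents.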

From there, select the recommended intent you want to create and the example statements you want to train that intent on, and voila! It’s as easy as clicking “create new intent.”

Finally, if you see recommendations that look similar to existing intents, you can click “add to existing intent” to add those recommendations as new user examples.

Both Intent Recommendations and Autolearning are now available in the Plus and Premium plans of Watson Assistant. We think both of these new features will take a lot of the hassle out of improving your Intents!

And keep an eye out for more posts in an ongoing chatbot improvement series; they’re coming soon!

New to the virtual assistant world? Try building one today with Watson Assistant.
