Under the hood: all the natural language understanding technology that makes Watson Assistant powerful

Saloni Potdar
IBM watsonx Assistant
6 min read · May 26, 2021

We recently published a technical paper that demonstrates how Watson Assistant does a better job understanding users than other conversational AI platforms. To achieve this, we use many complex technologies in natural language understanding and machine learning. We’d like to explain in basic terms how they work.

At heart, conversational AI software has to do three things: understand the user's question; find the best answer from its training, or search for it through documents; and return that answer in a concise, precise manner. Without understanding the question in the first place, everything else fails, so it's critical that the product do a good job of understanding user questions, with all their fallibility, sloppiness and, well, humanness.

Let’s examine the main algorithms of Watson Assistant’s natural language understanding, or NLU, engine — intent classification, entity recognition, irrelevance detection, and spellcheck.

Core natural language understanding components in Watson Assistant

Intent Detection Algorithm

In real-world conversational AI assistants, small, data-efficient models can trump large deep learning models in data-scarce scenarios. We have developed a proprietary algorithm that uses far less memory and energy than large-scale deep learning models.

We combine the latest deep learning techniques with traditional machine learning approaches to get the best of both worlds: our intent classification algorithms draw on deep learning, transfer learning, few-shot learning, AutoML and meta-learning. Check this article and research paper to see how we stack up against competitors.
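
To make the "deep features plus a classical classifier" idea concrete, here is a minimal sketch of few-shot intent classification. It is an illustration of the general technique, not Watson Assistant's proprietary algorithm; the encoder model, libraries, and training utterances are all assumptions for the example.

```python
# A minimal sketch: a frozen pretrained sentence encoder (transfer learning)
# feeding a lightweight classical classifier. NOT Watson Assistant's actual
# algorithm; model choice and data are illustrative assumptions.
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # pretrained, kept frozen

# Few-shot: a handful of labeled utterances per intent.
train_utterances = [
    "I can't log in", "password doesn't work",        # reset_password
    "what's my balance", "how much money do I have",  # check_balance
]
train_intents = ["reset_password", "reset_password",
                 "check_balance", "check_balance"]

clf = LogisticRegression(max_iter=1000)
clf.fit(encoder.encode(train_utterances), train_intents)

print(clf.predict(encoder.encode(["my password isn't working"])))
# -> ['reset_password']
```

Because the encoder already knows general language, the classifier on top needs only a few examples per intent, which is the data-efficiency property described above.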

Setting aside the technology bona fides, the benefit for our customers is that they can create delightful conversational experiences even as first-time users, and even with little training data. The virtual assistants appear smart because they understand the meaning of user utterances (the varied sentences people use to express a question or request) very accurately.

Watson Assistant natively understands 13 languages: English, German, French, Brazilian Portuguese, Spanish, Italian, Dutch, Czech, Arabic, Japanese, Korean, and traditional and simplified Chinese. We handle the nuances of each, like detecting word boundaries in German and Chinese, or understanding accents in French and Spanish, to name a few.

In addition, we recently introduced a universal language option that lets administrators train an assistant on any written language (Klingon, if you want).

Intent Detection Algorithm in Watson Assistant

Irrelevance Detection Algorithm

Another tricky problem in the world of conversational AI is detecting when end users are asking about things that are off-topic. For example, a banking virtual assistant should not be expected to answer questions about, say, dinosaurs, and it should gently redirect the user back to banking.

As we explained in the intent detection section above, Watson Assistant needs very little training data to accurately understand user intents. The flip side is that with so little data, the boundary between in-domain and out-of-domain questions becomes harder to detect.

To solve that, we model the distribution of the training data by mapping it to a domain representation using deep learning models. Based on the learned model, we treat any input whose representation doesn’t follow the distribution as irrelevant. This means that our models have a good grasp of sentences that users are likely to say when talking about banking. When the models encounter concepts that aren’t mentioned in banking, they detect that the conversation isn’t headed in the right direction.
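A simple way to picture this idea is as distance from the learned domain region in an embedding space. The sketch below uses a centroid-plus-threshold rule as a stand-in for the deep density modeling described here; the encoder, threshold, and data are assumptions for illustration only.

```python
# A minimal sketch of distribution-based irrelevance detection: embed the
# training utterances, then flag inputs that fall too far from the learned
# domain region. A simplification of the approach described in the post.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

banking_utterances = ["open a savings account", "what's my balance",
                      "transfer money to checking", "report a lost card"]
emb = encoder.encode(banking_utterances, normalize_embeddings=True)
centroid = emb.mean(axis=0)
centroid /= np.linalg.norm(centroid)

# In practice the threshold is tuned on held-out data; hard-coded here.
THRESHOLD = 0.45

def is_irrelevant(utterance: str) -> bool:
    v = encoder.encode([utterance], normalize_embeddings=True)[0]
    return float(v @ centroid) < THRESHOLD  # low similarity = off-topic

print(is_irrelevant("how big was a T. rex?"))  # likely True
print(is_irrelevant("I lost my debit card"))   # likely False
```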

We combine domain representation with another approach: modeling the boundary between the assistant's training data and out-of-scope questions using additional background data. Our customers can also help the model recognize out-of-domain examples by providing counterexamples. More details here.
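The complementary boundary-modeling idea can be sketched as a binary in-scope vs. out-of-scope classifier, with generic background text and any customer-provided counterexamples serving as negatives. Again, the data and model choice here are illustrative assumptions, not the production system.

```python
# A sketch of boundary modeling: in-scope utterances are positives;
# background data and customer counterexamples are negatives.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")

in_scope = ["open a savings account", "report a lost card"]
background = ["what's the weather tomorrow", "tell me about dinosaurs"]
counter_examples = ["do you sell insurance"]  # customer-supplied negatives

X = encoder.encode(in_scope + background + counter_examples)
y = [1] * len(in_scope) + [0] * (len(background) + len(counter_examples))

boundary = LogisticRegression(max_iter=1000).fit(X, y)

# Probability that a new utterance is in scope:
print(boundary.predict_proba(encoder.encode(["what do dinosaurs eat"]))[:, 1])
```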

Autocorrection Algorithm

Remember that utterances are the numerous ways end users can ask a specific question or request an action. For example, "password doesn't work" and "I can't log in" are two utterances that map to the "reset_password" intent.

Given that so much depends on understanding utterances correctly, the system needs an autocorrection algorithm to fix misspelled words. But it can't be too aggressive, because business lingo often relies on abbreviations and trademarks with goofy renditions of real words. It also needs features like profanity filtering. Watson Assistant has all this, and more.

Underneath the covers, Watson Assistant is powered by language models, phonetic models, edit distance models, and deep learning, so it can correct a wide variety of misspellings in enterprise use cases. More details on the feature can be found here.
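Here is a toy sketch of the conservative, edit-distance side of that idea: correct only near-miss misspellings, and never touch terms in a protected business vocabulary. The vocabulary, protected terms, and cutoff are made up for illustration; the real feature layers language and phonetic models on top.

```python
# A toy edit-distance autocorrector that stays conservative and protects
# business lingo. Illustrative only; vocabulary and cutoff are assumptions.
from difflib import get_close_matches

VOCABULARY = ["password", "account", "balance", "transfer", "reset"]
PROTECTED = {"ACH", "IBM", "watsonx"}  # business terms we must not "fix"

def autocorrect(utterance: str) -> str:
    corrected = []
    for word in utterance.split():
        if word in PROTECTED or word.lower() in VOCABULARY:
            corrected.append(word)  # already known; leave it alone
            continue
        # high cutoff keeps correction conservative: only near matches qualify
        match = get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=0.8)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)

print(autocorrect("reset my pasword"))     # -> "reset my password"
print(autocorrect("send an ACH transfr"))  # -> "send an ACH transfer"
```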

Autocorrection algorithm in Watson Assistant

Entity Recognition Algorithms

Let’s define “entity” in the context of conversational AI. Entities are specific nouns that have special significance to a business. For example, “IBM Watson Discovery” is a software product that we want the virtual assistant to pay special attention to: in a product troubleshooting chatbot, we want the system to understand what “I need help with discovery” means.

Entity Recognition Algorithms in Watson Assistant

We have several different algorithms for entity recognition in Watson Assistant.

  • Contextual entities allow your virtual assistant to detect entities based on the context of the user utterance, using a named-entity recognition model trained with deep learning features.
  • System entity algorithms identify dates, times, ranges, numbers, currencies, and so on in user utterances using grammar-based NLP techniques.
  • Dictionary-based entities are those for which the administrator defines specific terms, synonyms, or patterns. At run time, the assistant finds entity mentions only when a term in the user input exactly matches the value or one of its synonyms.
  • Fuzzy match entities recognize terms whose syntax is similar to the entity value and synonyms you specify, without requiring an exact match.
  • Pattern-based entities let you define a regular expression that matches the entity (the sketch after this list illustrates the dictionary, fuzzy, and pattern approaches).
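
As promised above, here is a compact sketch of three of these strategies side by side: dictionary (exact synonym match), fuzzy match (near match), and pattern (regular expression). The entity names, synonyms, and similarity cutoff are invented for the example and are not Watson Assistant's internals.

```python
# A compact illustration of dictionary, fuzzy, and pattern-based entity
# matching. Entity names and thresholds are illustrative assumptions.
import re
from difflib import SequenceMatcher

SYNONYMS = {"discovery": "IBM Watson Discovery",
            "watson discovery": "IBM Watson Discovery"}
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def find_entities(utterance: str):
    text = utterance.lower()
    entities = []
    for term, entity in SYNONYMS.items():
        if term in text:                       # dictionary: exact match
            entities.append(("product", entity))
        elif any(SequenceMatcher(None, term, tok).ratio() > 0.85
                 for tok in text.split()):     # fuzzy: near match
            entities.append(("product", entity))
    for email in EMAIL_PATTERN.findall(utterance):  # pattern-based
        entities.append(("email", email))
    return entities

print(find_entities("I need help with discovry, reach me at jo@example.com"))
# -> [('product', 'IBM Watson Discovery'), ('email', 'jo@example.com')]
```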

Several more AI algorithms power other Watson Assistant features, like intent recommendations, user-example recommendations, entity recommendations, and intent disambiguation, providing powerful options for subject matter experts.

Building these algorithms is challenging, especially when the factor of scale is thrown in. Product constraints meant to keep the experience user-friendly, like short training times that make the assistant feel responsive, add to the challenge of designing and developing the NLP algorithms.

To ensure a good experience for the end user, all these models must respond quickly; if a model is too sluggish, the conversation becomes unnatural. Improving and maintaining the natural language understanding algorithms is also a difficult task: every change we make to the NLP pipeline must be measured carefully, since a rollout affects all the custom-built models our customers have deployed. In the next part of the series, we will discuss how we test, assure the performance of, and measure the quality of the algorithms prior to every release.
