All about entities (Part 1): Dictionaries and Patterns

Don’t let your assistant miss a word!

Published in IBM watsonx Assistant, Apr 25, 2019

Entities in IBM Watson Assistant let you home in on exactly what your end users are saying, and allow you to build a more personalized assistant. You can use dictionaries and patterns to define entities quickly, or for more advanced use cases, you can train Watson to recognize entities from the context of what a user has said.

In this two-part series, Dan O’Connor and I present a holistic view of entities, and share some tips based on real-life use cases to help you get the most out of them. Part 1 covers the basics, and part 2 focuses on contextual entities.

This article assumes you are familiar with Watson Assistant, so we’re going to fly through the basics. If you need a more detailed introduction, please see the getting started tutorial.

What are entities and intents?

To understand entities, you first need to understand intents. Intents and entities are the building blocks of natural language understanding:

1. Intents (prefixed with #) are the goals your customer is trying to accomplish. You train Watson Assistant to detect intents by providing examples of what your users would say (sometimes called “user utterances”) to express that intent or goal.

2. Entities (prefixed with @) are concepts or objects that matter to the tasks your assistant can perform.

Let’s say your assistant will help your customers buy office supplies from your online store. Your assistant needs to understand a user’s #makePurchase intent. Your assistant also needs to understand what your user wants to purchase, which you can model with an @officeSupply entity.

Entities and synonym values

To define an entity, simply give it a name and list all the “synonyms” a user might use to refer to it.

Let’s say you are selling two kinds of office supplies: pens and notebooks. You can define an “entity” named @officeSupply, with two “values” of type “synonym”.

When an entity value is defined as a “dictionary” by selecting “synonym” type, Watson Assistant performs an exact match of synonyms to recognize the entity. Note that Watson Assistant also includes the name of the value (e.g. “notebook” or “pen”) as one of the synonyms.

Let’s use the Try It Out panel to test the entity definition, with the following user utterance:

I want to buy a ballpoint pen

Watson Assistant recognizes “ballpoint pen” as an entity. After recognizing the entity, Watson Assistant returns the entity and value that match the synonym (i.e. @officeSupply:pen). The confirmation text comes from the dialog node whose condition is the detection of @officeSupply:pen.
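To make the matching step concrete, here is a minimal Python sketch of dictionary-style entity recognition. It is not how Watson Assistant is implemented; the OFFICE_SUPPLY dictionary, its synonym lists, and the match_office_supply function are illustrative assumptions based on the examples in this article.

```python
import re

# Illustrative dictionary for the @officeSupply entity.
# The synonym lists are assumptions based on the examples in this article.
OFFICE_SUPPLY = {
    "pen": ["ballpoint pen"],
    "notebook": ["workbook", "pad"],
}


def match_office_supply(utterance):
    """Return (value, matched_text) for the longest synonym found, or None."""
    text = utterance.lower()
    best = None
    for value, synonyms in OFFICE_SUPPLY.items():
        # The value name itself also counts as a synonym.
        for phrase in [value] + synonyms:
            # Whole-word, case-insensitive exact match.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                if best is None or len(phrase) > len(best[1]):
                    best = (value, phrase)
    return best


print(match_office_supply("I want to buy a ballpoint pen"))
```

Because the match is exact, an utterance such as “Do you sell workbooks?” finds nothing in this sketch: the plural “workbooks” is not one of the listed synonyms.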

Entity and value hierarchy

As you may have noticed, an entity and its values effectively provide a hierarchy that is one level deep: the entity (e.g. @officeSupply) is the parent, and its values (e.g. @officeSupply:notebook) are the children. Synonyms of each value (e.g. “workbook” and “pad”) are different ways a user may mention an entity value.

One advantage of this hierarchy is the ability to condition a dialog using the entity or each of its child values. In some cases, having a parent dialog node that conditions on the parent entity may simplify your dialog design. Such a dialog node can perform preliminary tasks, like making a function call to an external system, or setting context variables, before you implement the dialog responses customized for each value.
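The parent and child arrangement can be pictured with a short Python sketch. This only simulates the idea and is not Watson Assistant dialog syntax; the respond function, the context variable name, and the reply texts are hypothetical.

```python
def respond(entity, value):
    """Simulate a parent node on @officeSupply with one child node per value."""
    if entity != "officeSupply":
        return None  # the parent condition did not match

    # Parent node: shared preliminary work, here setting a context variable.
    context = {"last_supply": value}

    # Child nodes: responses customized for each entity value.
    child_responses = {
        "pen": "Sure, let's find you a pen.",
        "notebook": "Sure, let's find you a notebook.",
    }
    reply = child_responses.get(value, "Which office supply do you need?")
    return context, reply


print(respond("officeSupply", "pen"))
```

The shared work runs once in the parent, so each child only has to supply its own response.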

This hierarchy can also be useful when adding examples to your intents. Watson Assistant allows you to directly reference an entity name in the user examples of intents.

Using direct entity references can improve the performance of intent classification. However, if you decide to use direct entity references, there are a few details you need to consider. You can read more about this in the documentation.

Patterns for entity recognition

Sometimes entities can be expressed concisely with patterns. For example, phone numbers, product IDs, or credit card numbers often follow a well-defined pattern. You can use regular expressions to define pattern entity values and teach your assistant how to recognize such entities.

Let’s say your users keep asking for printer ink cartridges, whose product IDs always start with the letters INK, followed by a one-, two-, or three-digit number. You can define a new pattern value called PRINTER_INK with a regular expression that matches this format.
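The product ID format described above can be sketched as a regular expression. The expression below is an assumption derived from that description, not the exact pattern from the original entity definition, and it is tested here with Python’s re module rather than inside Watson Assistant. The find_product_id helper is hypothetical.

```python
import re

# Assumed pattern for the PRINTER_INK value:
# the letters INK followed by one to three digits, as a whole word.
PRINTER_INK = re.compile(r"\bINK\d{1,3}\b")


def find_product_id(utterance):
    """Return the first product ID found in the utterance, or None."""
    match = PRINTER_INK.search(utterance)
    return match.group(0) if match else None


print(find_product_id("Do you have INK42 in stock?"))
```

The word boundaries keep the sketch from matching a longer ID such as INK1234, and the match is case-sensitive as written.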

Let’s assume you don’t sell ink cartridges in your store directly, but you would like your assistant to direct your users to the right product page on a partner’s website. You can extract the exact text the user typed using (@officeSupply.literal), and use it to create a customized response in your dialog.
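As a rough illustration of building a response from the matched text, the Python sketch below plays the role of the dialog response. The regular expression, the partner URL, the ink_response function, and the reply wording are all hypothetical; in Watson Assistant the matched text would come from the entity’s literal rather than from Python’s re module.

```python
import re

# Assumed pattern: INK followed by one to three digits.
PRINTER_INK = re.compile(r"\bINK\d{1,3}\b")

# Hypothetical partner site used only for this example.
PARTNER_URL = "https://partner.example.com/products/"


def ink_response(utterance):
    """Build a reply that links to the partner page for the mentioned product ID."""
    match = PRINTER_INK.search(utterance)
    if match is None:
        return None
    literal = match.group(0)  # plays the role of @officeSupply.literal
    return f"We don't stock {literal}, but our partner does: {PARTNER_URL}{literal}"


print(ink_response("Where can I get INK7?"))
```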

Fuzzy match

Fuzzy matching adds some flexibility to strict synonym matching by accounting for linguistic variations, such as misspellings and slang. Fuzzy match is enabled by default when an entity is created, but you can turn it off easily in the entity details screen.

Let’s say we have not added the plural form “notepads” as a synonym of @officeSupply:notepad, and then a user asks the assistant the following question:

Can I please buy 3 notepads?

If the @officeSupply entity has fuzzy match enabled, Watson Assistant will recognize the word “notepads” and normalize it to the @officeSupply:notepad entity value.
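To get a feel for this kind of tolerant matching, here is a small Python sketch that uses string similarity from the standard difflib module. This is a stand-in technique chosen for illustration, not Watson Assistant’s fuzzy matching algorithm; the SYNONYMS table, the fuzzy_match function, and the 0.8 cutoff are assumptions.

```python
import difflib

# Illustrative synonym-to-value table for @officeSupply.
SYNONYMS = {
    "notepad": "notepad",
    "pen": "pen",
    "ballpoint pen": "pen",
}


def fuzzy_match(word, cutoff=0.8):
    """Return the entity value whose synonym is most similar to word, or None."""
    candidates = difflib.get_close_matches(
        word.lower(), list(SYNONYMS), n=1, cutoff=cutoff
    )
    return SYNONYMS[candidates[0]] if candidates else None


print(fuzzy_match("notepads"))  # plural form
print(fuzzy_match("notpad"))    # misspelling
```

Both calls resolve to the notepad value even though neither string is listed as a synonym, while an unrelated word such as “stapler” falls below the cutoff and returns None.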

You can define sophisticated bots with dictionaries and patterns. Fuzzy matching makes dictionaries even smarter, but their flexibility is limited to linguistic variations. What if your user says something like this?

I need to buy those yellow sticky things for my home office.

Clearly, your customer wants to buy a PostIt, and the phrase “yellow sticky things” is an @officeSupply entity. How can you design your virtual assistant to deal with such vague entities? The answer is contextual entities, and you can learn all about them in part 2 of this series.