Dialogflow — the optional words

While many stories on Dialogflow describe how you can build a chat bot in minutes, this one is about a bot that evolves beyond a toy project.

When we try to cover a large set of user utterances, we find ourselves struggling to maintain many variations of phrases in which some words are optionally dropped. This article suggests a method to automate some of the labor involved in covering the many expressions that users may throw at our chat agent.

Dialogflow (formerly Api.ai) is a service owned by Google that allows developers to build natural language interfaces to automated processes in their organization or to distributed systems.

Out of the box, Dialogflow brings some natural language understanding (NLU) and dictionaries (e.g. “July 3rd” is a date, “Mike” is a person’s name), but the real power of the tool is the ability to train its machine-learning (ML) engine on your domain using training phrases and contexts.

A training phrase is a natural language phrase that the developer of a chat bot expects users to type in. For example:

[1] show me all articles with the words global warming

Obviously, developers cannot predict every utterance their users may try. Instead, the NLU engine attempts to detect the user’s intention (the intent), then extracts the parameters (slots) of the query. With the intent and slots in hand, the developer can serve the action or response.
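To make the intent/slot split concrete, here is a minimal sketch of the kind of structured result a developer works with once a query has been understood. The `Understanding` class, the `search-articles` intent name and the `search_text` slot name are illustrative assumptions, not Dialogflow types.

```python
from dataclasses import dataclass, field


@dataclass
class Understanding:
    """Hypothetical container for what an NLU engine hands back."""
    intent: str                                # what the user wants to do
    slots: dict = field(default_factory=dict)  # parameters extracted from the query


def respond(result: Understanding) -> str:
    """Pick an action based on the detected intent and its slots."""
    if result.intent == "search-articles":
        return f"Searching articles for: {result.slots.get('search_text', '')}"
    return "Sorry, I did not understand that."


# The result we would hope to get for phrase [1].
result = Understanding("search-articles", {"search_text": "global warming"})
print(respond(result))
```

The point is only that the application code never parses the raw sentence; it branches on the intent and reads the slots.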

Suppose we add the example phrase above as a training phrase for an intent we might call “search-articles”. We kind of expect Dialogflow to detect the intent from the appearance of the word “articles” early in the phrase. But how would it determine where in the phrase the description starts? Is it “with the words…” or “the words global…”?

In general, the more training phrases — the better the accuracy of prediction. So if you added the training phrase:

[2] find essays containing global warming

to the same “search-articles” intent, we now expect Dialogflow to infer that “find” is like “show me”, that “essays” is like “articles” and that “containing” is like “with the words”.

AI technology is progressing, but this kind of inference is not possible without some additional human guidance, mainly because human language can be ambiguous. Consider the training phrase:

[3] find articles containing violent language

If a user made this query, was the intention to find articles that contain all sorts of ‘f’-words, or to find articles with the verbatim string “violent language” in their text?

Entities to the rescue!

Much like the grade-school exercise of marking grammatical roles such as subject, verb and noun, we can annotate word groups in the phrase as having a role that is important to our software application.

If you annotate the words “global warming” in phrases [1] and [2] as the slot assigned to the search text, and do the same for the words “violent language” in phrase [3], you give Dialogflow some anchors for predicting where the description part will reside in a yet unseen query coming from a user later on, such as:

[4] find essays with the words how to train dogs

An entity represents the role of words in the sentence, and the annotation provides the knowledge of where in the phrase that entity appears.
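As a rough illustration, an annotated training phrase can be thought of as a list of parts, where the annotated parts carry an entity type. The field names `text`, `entityType` and `alias` below follow the JSON shape of training phrase parts in the Dialogflow v2 API; the `@search-text` entity and the `search_text` alias are hypothetical names chosen for this sketch.

```python
# Phrases [1] and [3] as lists of parts; only the search text is annotated.
PHRASE_1 = [
    {"text": "show me all articles with the words "},
    {"text": "global warming", "entityType": "@search-text", "alias": "search_text"},
]
PHRASE_3 = [
    {"text": "find articles containing "},
    {"text": "violent language", "entityType": "@search-text", "alias": "search_text"},
]


def phrase_text(parts):
    """Rebuild the plain phrase by concatenating the part texts."""
    return "".join(part["text"] for part in parts)


def annotated(parts):
    """Return {alias: text} for every part that carries an annotation."""
    return {part["alias"]: part["text"] for part in parts if "entityType" in part}


print(phrase_text(PHRASE_1))
print(annotated(PHRASE_3))
```

Both phrases end with a part of the same entity type, and that shared position is the anchor Dialogflow can learn from.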

Another important benefit of creating and annotating entities in Dialogflow is the ability to define synonyms. You can teach Dialogflow that “find”, “search” and “show” are all equivalent if they appear in the same position in the phrase. This feature saves you from adding multiple phrases with similar structure.
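The idea behind synonyms can be sketched in a few lines: each entity entry has one reference value and a list of words that should resolve to it. The `value`/`synonyms` layout mirrors how Dialogflow entity entries are defined; the lookup code itself is only an illustration of the concept, not how Dialogflow matches words internally.

```python
# One entry of a hypothetical "search-verb" entity.
SEARCH_VERB_ENTRIES = [
    {"value": "find", "synonyms": ["find", "search", "show"]},
]


def build_lookup(entries):
    """Map every synonym (lowercased) to its reference value."""
    lookup = {}
    for entry in entries:
        for synonym in entry["synonyms"]:
            lookup[synonym.lower()] = entry["value"]
    return lookup


LOOKUP = build_lookup(SEARCH_VERB_ENTRIES)


def canonical(word):
    """Return the reference value for a word, or None if it is not a synonym."""
    return LOOKUP.get(word.lower())


print(canonical("Search"))
print(canonical("articles"))
```

One training phrase annotated with this entity then stands in for the same phrase written with each of the three verbs.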

Real people, real world

Alas, users don’t know or care about your expectations for their query. Keeping up with our example, it is likely that at some point your logs will show that your bot failed to understand a user who attempted the query:

[5] articles about how to train dogs

Is that so different from training phrase [3]? Why did it fail?

Let’s start with the easy piece: query [5] uses the preposition “about” instead of the “containing” in [3]. The solution: annotate these prepositions as an entity and add synonyms that cover them (“about”, “on”, “containing”, “regarding”, “talking about”…).

The second problem is that the user omitted the opening verb. Instead of “show me articles…”, the query just starts with “articles…”.

Solution: add the shorter variation as a training phrase as well. But we notice that this verb shortening will be required for every training phrase we had so far! We’ll need to manually add:

“articles with global warming”, “all articles with global warming”, “show me articles with…”, “show all articles with…”, and so on.

As the project evolves to cover tens of training phrases and all their shorter variations, the maintenance of such a training set becomes unmanageable.
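A quick back-of-the-envelope calculation shows why. If every optional word group can independently be kept or dropped, a phrase with k optional groups needs 2^k variations, the longest one included. The project size used below (30 master phrases with 3 optional groups each) is a made-up figure for illustration.

```python
def variation_count(optional_parts: int) -> int:
    """Phrases needed to cover every keep/drop choice, longest phrase included."""
    return 2 ** optional_parts


for k in (1, 2, 3, 4):
    print(f"{k} optional part(s): {variation_count(k)} phrases")

# Hypothetical project: 30 master phrases, 3 optional parts each.
print(30 * variation_count(3))
```

With two optional groups a single phrase already turns into four, and every extra optional group doubles the count again.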

Optional words handling with Dialogflow API

What if we could create only the training phrases with the longest wording and use scripting to create the phrases with shorter variation?

That way, human labor is still needed to create the “master” phrase, but we avoid the tedious housekeeping of maintaining all the shorter variations with optional words dropped.

Dialogflow provides an automation API for listing, creating, updating and deleting intents.
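For orientation, here is a small sketch of the two REST calls that matter for this article, assuming the Dialogflow v2 API: a GET that lists the intents of an agent and a PATCH that updates one intent. The helper names and the project id are hypothetical, authentication (an OAuth bearer token) is left out, and the exact paths and parameters should be checked against the current API reference before use.

```python
from urllib.parse import urlencode

BASE = "https://dialogflow.googleapis.com/v2"


def list_intents_url(project_id: str) -> str:
    """URL for a GET that lists intents.

    The full intent view is requested so that training phrases are
    included in the response.
    """
    query = urlencode({"intentView": "INTENT_VIEW_FULL"})
    return f"{BASE}/projects/{project_id}/agent/intents?{query}"


def update_intent_url(intent_name: str) -> str:
    """URL for a PATCH that sends back a modified intent.

    `intent_name` is the `name` field of a listed intent, for example
    "projects/<project>/agent/intents/<intent id>".
    """
    return f"{BASE}/{intent_name}"


print(list_intents_url("my-project"))
print(update_intent_url("projects/my-project/agent/intents/123"))
```

Google also publishes client libraries that wrap these calls; the plain URLs are shown here only to keep the sketch free of dependencies.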

The human labor part will be as follows:

  1. Add the master phrase “show all articles with the words global warming” as the training phrase in its longest version.
  2. Annotate “show all” as an optional verb entity. Let’s name this entity “search-verb”.
  3. Annotate “the words” as another optional entity. Let’s call it “pre-description”.

The scripting part, using the Dialogflow API, is:

  4. Get the list of training phrases for the search intent.
  5. Find the text parts in the downloaded phrase that have the entity types “search-verb” and “pre-description”. Consider the texts matching these entities as optional.
  6. Create the 3 new phrase combinations obtained by dropping the optional portions from the original one.
  7. Add these new phrases and update the intent using the API.
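The core of the scripting part (finding the optional parts, generating the shorter combinations and merging them into the intent) can be sketched as plain Python operating on the intent as a dictionary. This is a minimal sketch, not the original script: it assumes the intent has already been downloaded in the JSON shape of the Dialogflow v2 API (`trainingPhrases`, `parts`, `entityType`), that entity types appear with an `@` prefix, and that the search text entity is called `@search-text`, which is a made-up name. The actual download and upload calls are left out.

```python
from itertools import product

# Entity types whose text may be dropped from a phrase.
OPTIONAL_ENTITY_TYPES = {"@search-verb", "@pre-description"}

MASTER = [
    {"text": "show all", "entityType": "@search-verb"},
    {"text": " articles with "},
    {"text": "the words", "entityType": "@pre-description"},
    {"text": " "},
    {"text": "global warming", "entityType": "@search-text"},
]


def render(parts):
    """Join part texts and collapse whitespace left behind by dropped parts."""
    return " ".join("".join(part["text"] for part in parts).split())


def shorter_variations(parts, optional_types=OPTIONAL_ENTITY_TYPES):
    """Return every variation with at least one optional part dropped."""
    optional_idx = [
        i for i, part in enumerate(parts)
        if part.get("entityType") in optional_types
    ]
    variations = []
    for keep_flags in product([True, False], repeat=len(optional_idx)):
        if all(keep_flags):
            continue  # nothing dropped: that is the phrase we started from
        dropped = {i for i, keep in zip(optional_idx, keep_flags) if not keep}
        variations.append([p for i, p in enumerate(parts) if i not in dropped])
    return variations


def add_variations(intent):
    """Append generated variations to the intent, skipping duplicates."""
    existing = {render(tp["parts"]) for tp in intent["trainingPhrases"]}
    for phrase in list(intent["trainingPhrases"]):
        for parts in shorter_variations(phrase["parts"]):
            text = render(parts)
            if text not in existing:
                existing.add(text)
                intent["trainingPhrases"].append({"type": "EXAMPLE", "parts": parts})
    return intent


intent = {
    "displayName": "search-articles",
    "trainingPhrases": [{"type": "EXAMPLE", "parts": MASTER}],
}
add_variations(intent)
for phrase in intent["trainingPhrases"]:
    print(render(phrase["parts"]))
```

Because duplicates are skipped, running the script again on an intent that already contains the shorter phrases adds nothing new. The generated parts keep their annotations, but the whitespace around dropped parts is only cleaned up in `render`, so a real script would probably want to tidy the part texts before uploading them.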

Final result

The phrase “show all articles with the words global warming” spawns three additional phrases:

“articles with the words global warming”,
“show all articles with global warming”,
“articles with global warming”

While this was a simple example, a real-life conversational interface built with Dialogflow requires continuous updates and maintenance of a large set of training phrases.

Using Dialogflow entity annotations and the intent management API, it is possible to keep a large dictionary of training phrase variations without manually remembering and typing each possible variation.