Piercing through the Dialogflow maze — A Deduction

Vishnu Priya Vangipuram
Published in Whispering Wasps
6 min read · Jun 26, 2020
Image Source: https://www.longisland.com/articles/09-21-14/movie-roundup-the-maze-runner-dominates.html

As a Conversational AI Engineer and an avid user of Dialogflow, I have always had one question running through my mind, whether or not I was facing a challenge:

Is Dialogflow an opaque box?

Is there any way to decipher the inner workings of a Dialogflow agent, beyond the rudimentary tuning of its basic, predefined configurations and settings?

Note: In case you want to first get a preliminary overview of the structure of Dialogflow, before proceeding further, you can check out the article below:

I was working on a complicated bot. The struggle was real: I could not achieve the desired results even after pruning and adding training phrases and trying every possible permutation and combination of Dialogflow's configurations.

I did not have the option of choosing another framework, and my dilemma was this: would I be able to solve my issues and get the desired results only when the next release of Dialogflow came out, or by upgrading to Enterprise? As an engineer at heart, neither option sounded exciting.

I couldn't wait any longer. I had to find out a way myself, but how? Was it really possible?

After a lot of Google searches on Google Dialogflow, and a pillar-to-post expedition on the web for a whole week -

Lo and behold! There is indeed a not-so-straightforward but effective way to check the quality of the training phrases used to train our bot, and also to change the way we train it, with a little peek into the functioning of Dialogflow!

What does “Quality of training phrases” mean?

We know that a bot is usually designed to answer user queries and guide the user by providing them clarity on what they intend to know.

For example, if we consider a customer service bot or a simple FAQ bot, our aim would be to answer a user query as specifically as possible, among the categories that we service.

This specificity is analogous to “classification” in ML jargon: we want each intent's training phrases (the different ways a user might phrase a query) to sit in a clear “isolated zone” among all the categories we have, so that we can respond accordingly.

This boils down to a multi-class classification ML model, where we classify the user query into intents (what the user intends to know). The more reliably we classify, the closer we get to our objective of fulfilling the user intent.

So the question still remains,

Is Dialogflow an opaque box?

And the answer now is No - because there is a way to traverse and see light at the end of the Dialogflow maze!

That leads us to the next important question -

How do we analyze the quality of training phrases?

We make use of the following key “concepts” to get to know the quality of our training phrases:

Cohesion:

A measure of the proximity between each pair of training phrases within an intent, computed separately for each intent.

The closer the phrases in an intent are to each other, the more confidently they can be grouped into a single “class” without ambiguity, which leads to a well “focused” intent.

This indicates that the higher the cohesion in an intent, the better the intent is classified.
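As a minimal sketch of cohesion, assume cosine similarity as the proximity measure (the article does not name one) and hand-made 3-dimensional toy vectors standing in for real phrase embeddings. The function names are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cohesion(vectors):
    """Average cosine similarity over every pair of phrases in one intent."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 1.0  # a single phrase is trivially cohesive
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy vectors standing in for real embeddings (assumption, not real data).
tight_intent = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
loose_intent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

tight_score = cohesion(tight_intent)
loose_score = cohesion(loose_intent)
print(f"tight intent: {tight_score:.3f}, loose intent: {loose_score:.3f}")
```

The intent whose vectors point in nearly the same direction scores close to 1, while the intent whose vectors are mutually orthogonal scores 0, which is the “focused” versus “unfocused” contrast described above.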

Separation:

A measure of how far apart two intents are, computed as the average distance between each pair of training phrases drawn from the two different intents.

If training phrases from two different intents are close to each other, it means that there is no clear isolation established between the two intents, which results in ambiguity of classes.

This indicates that there must be significant separation between two intents.
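A rough sketch of separation, assuming cosine distance (1 minus cosine similarity) as the distance measure and toy vectors in place of real embeddings. The intent names echo the ones used later in this article, but the vectors themselves are made up for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for same direction, 1.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def separation(intent_a, intent_b):
    """Average cosine distance over every cross-intent pair of phrases."""
    distances = [cosine_distance(a, b) for a in intent_a for b in intent_b]
    return sum(distances) / len(distances)


# Toy vectors standing in for real embeddings (assumption, not real data).
view_details = [[1.0, 0.2, 0.0], [0.9, 0.3, 0.0]]
change_details = [[0.9, 0.4, 0.0], [1.0, 0.3, 0.1]]
welcome = [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]

near = separation(view_details, change_details)
far = separation(view_details, welcome)
print(f"view vs change: {near:.3f}, view vs welcome: {far:.3f}")
```

A value near 0 flags two intents that overlap, while a value near 1 suggests they are well isolated from each other.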

Confusing Phrases:

The separation value calculated above also helps us spot confusing phrases: if training phrases from two different intents are highly similar to each other, it will be difficult for the bot to classify the user query into the right class and respond accordingly.
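One possible way to surface confusing phrases is to list every cross-intent pair whose similarity crosses a cutoff. The sketch below assumes cosine similarity, an arbitrary threshold of 0.9, and invented phrases with toy vectors; none of these come from the original analysis.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def confusing_pairs(intent_a, intent_b, threshold=0.9):
    """Return (similarity, phrase_a, phrase_b) for cross-intent pairs
    at or above the threshold, most similar first."""
    hits = []
    for phrase_a, vec_a in intent_a.items():
        for phrase_b, vec_b in intent_b.items():
            score = cosine_similarity(vec_a, vec_b)
            if score >= threshold:
                hits.append((score, phrase_a, phrase_b))
    return sorted(hits, reverse=True)


# Hypothetical phrases with toy vectors (assumption, not real embeddings).
view_details = {
    "show my profile": [1.0, 0.1, 0.0],
    "what details do you have on me": [0.7, 0.7, 0.1],
}
change_details = {
    "update my address": [0.0, 0.2, 1.0],
    "edit my details": [0.6, 0.8, 0.1],
}

offenders = confusing_pairs(view_details, change_details)
for score, phrase_a, phrase_b in offenders:
    print(f"{score:.3f}  {phrase_a!r} <-> {phrase_b!r}")
```

Only the pair that points in almost the same direction is reported, so it is the one worth rewording or moving to a different intent. The threshold is a tuning knob rather than a fixed rule.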

I get it, this can be confusing in theory, but it becomes much clearer when we take a look at a visual representation of all the training phrases.

Let’s get to the visual representation then!

To get a visual representation of the training phrases of our intents, we need to convert them into feature vectors (a.k.a. embeddings): mathematical representations that “digitize” each sentence or phrase semantically.

We make use of TensorFlow Hub, a library that provides reusable machine learning modules. These modules can be pre-trained models or embeddings extracted from text, images, etc.

For our case, we use the Universal Sentence Encoder (v2) module, which can transform our text into 512-dimensional vectors and generate embeddings!
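The embedding step could look roughly like the sketch below. Two assumptions are worth flagging: it loads version 4 of the Universal Sentence Encoder rather than the v2 module used in this article, because v4 can be called directly after `hub.load` in TensorFlow 2, and the helper names (`embed_phrases`, `load_universal_sentence_encoder`, `toy_encoder`) are hypothetical. The TensorFlow import is kept inside the loader so the rest of the sketch runs offline with a stand-in encoder.

```python
def embed_phrases(phrases, encoder):
    """Map each phrase to its embedding using any callable encoder."""
    phrases = list(phrases)
    vectors = encoder(phrases)
    return {
        phrase: [float(x) for x in vector]
        for phrase, vector in zip(phrases, vectors)
    }


def load_universal_sentence_encoder():
    """Load the encoder from TensorFlow Hub.

    Needs the tensorflow and tensorflow_hub packages plus network access.
    Version 4 is an assumption; the article itself used the v2 module.
    """
    import tensorflow_hub as hub  # imported lazily on purpose

    model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
    return lambda phrases: model(phrases).numpy()


def toy_encoder(phrases):
    """Stand-in encoder (character count, word count) so the sketch runs
    offline. It carries no semantic meaning."""
    return [[float(len(p)), float(p.count(" ") + 1)] for p in phrases]


embeddings = embed_phrases(["hello there", "show my profile"], toy_encoder)
for phrase, vector in embeddings.items():
    print(phrase, vector)
```

To get real 512-dimensional embeddings, pass `load_universal_sentence_encoder()` in place of `toy_encoder`.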

The Research: Step by Step:

We have gone through enough theory, so let's get into the practical part of it!

Since the focus of this research is to analyze the training phrases, let's concentrate on the analysis and try to understand what it tells us.

Step 1:

We have a Dialogflow agent with the relevant training phrases.

Step 2:

We have generated embeddings of our training phrases using the pre-trained Universal Sentence Encoder module from TensorFlow Hub.

Step 3:

Let's generate a visualization of the embeddings in two-dimensional space, for easier understanding.

Training Phrases marked in 2D space
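The article does not say which technique produced the 2D plot. As one plausible option, the sketch below assumes principal component analysis (PCA), implemented with plain power iteration so that it needs only the standard library. The vectors are toy stand-ins for real embeddings and all function names are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(vectors):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    count, dim = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / count for i in range(dim)]
    return [[v[i] - means[i] for i in range(dim)] for v in vectors]


def top_direction(rows, iterations=200, seed=0):
    """Power iteration: approximate the direction of largest variance."""
    dim = len(rows[0])
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # Multiply by X^T X without building the covariance matrix.
        scores = [dot(row, direction) for row in rows]
        updated = [sum(s * row[i] for s, row in zip(scores, rows))
                   for i in range(dim)]
        norm = math.sqrt(dot(updated, updated))
        if norm == 0.0:
            return [0.0] * dim  # no variance left to explain
        direction = [x / norm for x in updated]
    return direction


def project_2d(vectors):
    """Project vectors onto their two leading principal directions."""
    rows = center(vectors)
    first = top_direction(rows)
    # Remove the first direction from every row, then search again.
    residual = [[r[i] - dot(r, first) * first[i] for i in range(len(r))]
                for r in rows]
    second = top_direction(residual)
    return [(dot(r, first), dot(res, second))
            for r, res in zip(rows, residual)]


# Toy vectors: the first two and the last two form separate clusters.
vectors = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0],
           [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]
points = project_2d(vectors)
for x, y in points:
    print(f"({x:+.3f}, {y:+.3f})")
```

The resulting pairs can be handed to any plotting library, with one colour per intent, to get a scatter plot of the training phrases. The sign of each axis is arbitrary, so only the relative positions of the points carry meaning.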

Step 4:

Let’s classify the intents based on the representation we see. (We are not using any mathematical measure yet, but trying to infer through the visualization we have achieved.)

Visual Segregation of Training Phrases into classes

Quick Visual Inferences:

A quick look at the plot above shows that one training phrase from the “Change Private Details” intent is close to a training phrase from the “View Private Details” intent. => Separation seems to be low.

Also, we can see that the training phrases in the “Welcome Intent” are in good proximity with each other. => Cohesion seems to be high.

Note, however, that we have not yet looked at any mathematical measures, and still the visual representation gives us a fairly good idea of how our training phrases and intents are classified!

Let’s dig further and check the cohesion and separation measures, to pinpoint the exact training phrases that need attention.

Step 5:

Similarity Metric:

Let’s check the similarity metric, along with the phrases that are too close (similar) to each other and the phrases that are least close (non-similar) to each other.

This also gives us the exact training phrases, so we can find the “offenders”: the confusing phrases that lead to ambiguity in intent classification.

Too similar and non similar phrases between different intents

Step 6:

Cohesion:

Let’s check mathematically how close the training phrases in an intent are to each other.

A look at how close the training phrases in the same intent are

Step 7:

Separation:

Let’s check mathematically the distance between different intents.

A look at the Separation between the intents, which leads to classification of intents

With these intuitive and scientifically relevant steps, we are able to:

a) Understand the science behind the fundamental cognizance of our bot (Training Phrases, Classification of Intents)

b) Focus on only the relevant aspects to improve them(Similarity, Cohesion, Separation, Confusing Phrases), both visually and mathematically.

So now, the answer to the question we started with -

Is Dialogflow an opaque box?

based on our research and analysis, would be:

It is not an opaque box, but a translucent one!

I hope this research gives a sneak peek into, and helps someone with, improving the efficiency of training a Dialogflow bot scientifically rather than emotionally, and that it serves as a flashlight to pierce through the maze.

PS: Credits are due for the amazing research here that served as a guiding star for me for this exploration.

As always, I am glad to answer any queries on my LinkedIn profile: Vishnu Priya VR, and on my Twitter handle: Vishnu Priya VR.
