Contextual Conversational Engine — The Rasa Core Approach — Part 1

I have been really active in the Rasa community for the past 10 months or so and have been experimenting quite a bit with Rasa Core. While traditional chatbot development over the last 2 years has focused heavily on making the bot understand what you really mean, little has been done to bring a data science approach to conversation management, which has so far been (and rightly so) a very logic-driven machine: given that a bot understands the user's intention X, the machine must do Y. However, real-life conversations are a tad more complicated than straightforward logic. Of course, from a process point of view, this logic holds: it doesn't matter whether you go to a branch or fill out a form online, if you want to open a bank account, there is a process.


What value does Rasa Core bring?

I guess one thing I should make clear: I am no expert when it comes to understanding Rasa Core. However, given the time I have spent on it, I will pretend that I have expertise 😆

So let's talk about value.

The data science approach towards conversation management isn't going to radically change an existing process. If you want a freaking pizza, you MUST tell a chatbot what type of base and toppings you want and where you want it delivered.

But let's be clever about it; there are always exceptions, as in the case of pizza: ALLERGIES!! Something unexpected. Sure, you can define logic around them, but how many such rules are you going to code each day?

Keep in mind, Rasa Core is not changing your process, nor is the machine generating responses; it just allows you to handle exceptions better. Your rules still rule 📏

The Conversation

I would like to define a simple conversation before we start experimenting with the different elements. My goal, in the end, is to showcase the inner workings of Rasa Core and how to best utilise it for a particular conversation your chatbot might have. The focus will be solely on Dialogue Management and not at all on NLU (Natural Language Understanding); my assumption here is that your machine understands natural language perfectly.

Okay, in order to make that a workable assumption, we will need a conversation equivalent to one you would have with an app or a human.

We start with a business case here, and since I have worked a bit with the restaurant industry in the past and love cooking:

Let’s get your breakfast sorted!! 🥘

We will make a chatbot that is able to take your food order. The conversation is between the bot and a user ordering takeout.

Let’s build the different states

Ordering a takeout

As you can already see, even a simple ordering conversation is quite complicated. Here I have covered one of the exceptions: when a given item is not available, the bot can suggest some alternatives. You will soon find out that there are many such exceptions, and we will see how to deal with them.

Rasa Core Architecture

Here I will describe the architecture of how Rasa Core trains and how it parses a particular incoming text. Again, the assumption is that there is already an NLU implementation in place, since what is useful to Rasa Core for predicting the next step in the conversation is not the natural text but the intent of the incoming text.

Training Process

Training Steps

Let's dissect each step to see what is really going on in each of them. Keep in mind that each of these components is well defined in the Rasa Core documentation — https://rasa.com/docs/core/

I am merely presenting each one of them through my own lens, just to ensure that when I do the experimentation on the above business case, these terms are relatable.

  • Training Data — This is essentially all the stories, where you typically define what a normal conversation looks like for your process. You define this in a particular Rasa Core format; you will find information about the format here — https://rasa.com/docs/core/stories/
    If you check the link, you will find some important elements in the stories.
    Let's take an example and look at each of them:
## story_07715946 <!-- name of the story, just for debugging -->
* greet <!-- intent of the user -->
 - action_ask_howcanhelp <!-- what the bot should do -->
* inform{"location": "rome", "price": "cheap"} <!-- user utterance, in intent{entities} format; the next step of the conversation -->
 - action_on_it
 - action_ask_cuisine
* inform{"cuisine": "spanish"}
 - action_ask_numpeople <!-- action that the bot should execute -->
* inform{"people": "six"}
 - action_ack_dosearch
  • Domain — This is the heart of the chatbot. The domain basically determines what your chatbot should understand, what it can do, and what kind of information is necessary for your chatbot's context so it understands the user better. More info on the domain format is here: https://rasa.com/docs/core/domains/
  • Load Agent — I wanted to start with loading the agent before explaining loading the data, because the Agent (or the bot) is first loaded with some parameters that determine how the training data will be converted into features for training the agent. One really important parameter is
    - Policy (explained below)
  • Policy — This is the core model architecture of the bot. A policy is what defines what the next action is going to be. As Rasa Core is open source, you can indeed create your own policy, but let's get the basics right and see which policies are already available and used by default. You will find more information here — https://rasa.com/docs/core/policies/
    - Memoization Policy — Though the name sounds like something that came straight out of a research paper, the core idea of this policy (or algorithm) is to copy your training data, remember it by heart, and then predict the next action by lookup. The prediction is binary: if the conversation matches one of the stories in your training data, the next action will have a confidence of 1; if it doesn't, the confidence is 0. How far back in the conversation this match needs to be checked depends on max history (mentioned below). Note: there is also an AugmentedMemoizationPolicy — you can use this instead if slots are not set during prediction but are present in your training data; basically, it helps the Memoization Policy disregard slots in the tracker when making a decision.
    - Keras Policy — A machine learning model, a Recurrent Neural Network (LSTM), that takes in a bunch of features to predict the next possible action. We will learn about featurization below.
    - Fallback Policy — This is a straightforward functional logic that takes three parameters: the NLU threshold, the Core threshold, and the fallback action. If the NLU confidence is below the NLU threshold, the fallback action is called. Likewise, if the NLU confidence is fine but the given intent is not present in the domain, the fallback action is called.
    The Core threshold is the second fallback: the bot understood the intent very well but is unsure about the predicted action, and if its confidence is below the Core threshold, the fallback action is called.
  • Featurization — I won't explain much here, since featurization is quite well documented in the Rasa Core docs: https://rasa.com/docs/core/api/featurizer/
    The basic idea for any model to work, especially a neural network, is that the network needs to be fed a bunch of features that determine the next action in the conversation. These features are generally vector representations of the conversation. With the Keras policy, for example, each state of the conversation contains features such as intents, entities, slots, and previous actions. The states, with every feature determined, are then fed to the network, which predicts a label (one of the actions in your list). My advice: read about the SingleStateFeaturizer really well.
  • Parameters — In order to feed the policies with a bunch of conversation features, there is one hyperparameter that plays a really important role: max_history.
    - Max History — max_history basically provides the SingleStateFeaturizer with a value that determines the number of feature sets to be fed to the neural network. It is also useful for the story generation we will talk about below. In order for the network to provide an output, we need to figure out how far into the past of the conversation the policy needs to look back to determine what to do next. By default, max history is set to 5. So the Memoization policy, which uses the SingleStateFeaturizer with a max_history of 5, will look back 5 steps in your tracker to determine whether the next step matches any of the stories in your training data. Similarly, for the Keras policy, max history increases the total number of features driving the conversation.
  • Note: Another important feature is slots — you can read about them here: https://rasa.com/docs/core/slots/
    Not much to deep dive into in this section. Just remember that, unlike the max history parameter, slots usually persist across the whole history of the conversation, and slots can drive a conversation from state A to state B. It is important to carefully assess the type of slot you would like, since the types are featurized differently. More info in the Rasa documentation.
  • Load Data — Once you have your domain and your training data in Rasa Core format, it is time to load the data. There are a few more parameters that are important in this case.
    - Augmentation Factor — Upon loading the stories, this parameter randomly glues stories together to create longer stories. If you don't want any augmentation of your training data, you can disable it by setting the factor to 0 upon training. By default, it is 20.
  • Training — Policies work in an ensemble, meaning you can pass more than one policy to the Agent. By default, the Memoization and Keras policies are fed to the training process; each of them trains separately, but they are used together in an ensemble to predict the next action, and only one wins the battle. More on that later.
  • Persist — Once you have trained your model, it is persisted to a file system or a cloud storage of your choice.
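To make the Memoization Policy and max_history more concrete, here is a minimal, self-contained sketch in plain Python. This is not Rasa Core's actual implementation; all class and variable names here are my own invention for illustration only:

```python
class ToyMemoizationPolicy:
    """Toy illustration: memorise training stories, predict by exact lookup."""

    def __init__(self, max_history=5):
        self.max_history = max_history
        self.lookup = {}  # maps a window of recent turns -> next action

    def train(self, stories):
        # Each story is a list of turns, e.g. ["greet", "action_ask_howcanhelp", ...]
        for story in stories:
            for i in range(1, len(story)):
                window = tuple(story[max(0, i - self.max_history):i])
                self.lookup[window] = story[i]

    def predict(self, tracker_events):
        # Binary confidence: 1.0 on an exact match of the recent window, else 0.0
        window = tuple(tracker_events[-self.max_history:])
        if window in self.lookup:
            return self.lookup[window], 1.0
        return None, 0.0


stories = [["greet", "action_ask_howcanhelp",
            "inform", "action_ask_cuisine",
            "inform", "action_ask_numpeople"]]
policy = ToyMemoizationPolicy(max_history=3)
policy.train(stories)

action, confidence = policy.predict(["greet", "action_ask_howcanhelp", "inform"])
# exact match of the 3-turn window -> ("action_ask_cuisine", 1.0)
action2, confidence2 = policy.predict(["greet", "something_unseen"])
# no matching story -> (None, 0.0)
```

Notice how max_history controls the lookup: with a larger window, more of the past must match before the policy is confident, which is exactly the trade-off described above.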

Prediction Process

After you have a trained model, let's deep dive into how a prediction is made.

Parsing in Rasa core

Load Model — Let's start by loading the model into memory before serving it with a server (Flask) and exposing an endpoint related to a particular channel; in our case we will deal with a REST API.

Interpreter — As I have explained before, the assumption is that we have a running interpreter that is able to read the raw text coming in from the user and output the user's intention along with some meaningful entities; these entities are saved as slots that drive the conversation.

CreateOrUpdate Tracker — If this is the first message of the conversation, Rasa Core will create a tracker object keyed by sender_id, which is the incoming identifier of the user. Make sure this id is unique to one user; otherwise the predictions might not be coherent for a particular human being.
The tracker object usually contains what was found by the interpreter and/or any new slots that are set through an API call, such as a user's birthday. The tracker object is stored in a tracker_store, which by default is in memory; however, upon industrialisation you will want to scale your API, and it is important that the tracker_store is externalised to either a MongoTrackerStore or a RedisTrackerStore. You can check my other tutorial, where I created a RedisTrackerStore for a multilingual chatbot. For this experiment, since I am not industrialising the chatbot, I will stick with the in-memory tracker store.
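As a rough sketch of this create-or-update behaviour (again, not Rasa Core's actual tracker_store code; the class names are illustrative), an in-memory store keyed by sender_id could look like this:

```python
class ToyTracker:
    """Minimal conversation state: events seen so far and slot values."""

    def __init__(self, sender_id):
        self.sender_id = sender_id
        self.events = []   # intents and actions, in order
        self.slots = {}    # e.g. {"cuisine": "spanish"}


class ToyTrackerStore:
    """In-memory store keyed by sender_id; swap for Redis/Mongo when scaling."""

    def __init__(self):
        self._trackers = {}

    def get_or_create(self, sender_id):
        # First message of a conversation -> create a fresh tracker;
        # otherwise return the existing one so context is preserved.
        if sender_id not in self._trackers:
            self._trackers[sender_id] = ToyTracker(sender_id)
        return self._trackers[sender_id]


store = ToyTrackerStore()
tracker = store.get_or_create("user-42")
tracker.events.append("inform")
tracker.slots["cuisine"] = "spanish"

# The same sender_id must come back with its state intact
same = store.get_or_create("user-42")
```

This also shows why sender_id must be unique per user: two humans sharing one id would share one tracker, and their conversations would blend into incoherence.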

Features — Once you have the tracker object, you need to featurize what is present in the tracker, based on the policies set upon training and the necessary parameters such as max_history. Once these features are generated, they are given to the policy ensemble, which determines the final outcome.
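A hand-rolled illustration of this featurization step, assuming a tiny made-up vocabulary (this is not the SingleStateFeaturizer itself, just the general idea of one-hot encoding the last max_history states):

```python
# Assumed toy vocabulary: everything the featurizer can one-hot encode
VOCAB = ["greet", "inform", "action_ask_howcanhelp",
         "action_ask_cuisine", "action_ask_numpeople"]

def one_hot(state):
    """One vector per conversation state (an intent or a previous action)."""
    return [1 if token == state else 0 for token in VOCAB]

def featurize(tracker_events, max_history=5):
    """Stack the vectors for the last `max_history` states; pad with zero
    vectors so the network always sees a fixed-size input."""
    window = tracker_events[-max_history:]
    padding = [[0] * len(VOCAB)] * (max_history - len(window))
    return padding + [one_hot(e) for e in window]

features = featurize(["greet", "action_ask_howcanhelp", "inform"], max_history=5)
# 5 rows (2 padding + 3 encoded states), each of length len(VOCAB)
```

The real featurizer also encodes entities and slots into each state vector, but the shape of the idea is the same: max_history states in, one fixed-size matrix out.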

PolicyEnsemble — Since we have trained different policies, when it comes to predicting the next action, each of these policies provides a score for a particular action. The maximum is then taken across all scores given by every policy, and whichever wins becomes the next action.
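This winner-takes-all step can be sketched in a few lines; the per-policy scores below are invented for illustration, not real Rasa Core output:

```python
# Hypothetical per-policy predictions: (action, confidence) pairs
predictions = {
    "MemoizationPolicy": ("action_ask_cuisine", 1.0),   # exact story match
    "KerasPolicy": ("action_ask_numpeople", 0.72),      # neural network guess
}

# The ensemble keeps the single highest-confidence prediction
winning_policy = max(predictions, key=lambda name: predictions[name][1])
next_action, score = predictions[winning_policy]
# -> MemoizationPolicy wins with action_ask_cuisine at confidence 1.0
```

This is also why the Memoization Policy's binary confidence matters: whenever a conversation exactly matches a training story, its score of 1.0 beats any neural network estimate, so your stories take precedence.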

Update Tracker — Once you have the predicted action, you need to update the tracker for the next turn.

ExecuteAction — Now you will be finally executing your action, be it an API call or a message sent back to the user
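Tying these last two steps together, a toy sketch (the response mapping and function name are hypothetical, not Rasa Core's action execution code):

```python
def execute_action(action_name, tracker_events):
    """Illustrative only: look up a canned response and record the action
    on the tracker so the next prediction sees it."""
    responses = {"action_ask_cuisine": "What cuisine would you like?"}
    tracker_events.append(action_name)  # update the tracker for the next turn
    # In a real bot this could just as well be an API call instead of a message
    return responses.get(action_name, "")

tracker_events = ["greet", "action_ask_howcanhelp", "inform"]
reply = execute_action("action_ask_cuisine", tracker_events)
# reply -> "What cuisine would you like?"; the tracker now ends with the action
```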

Phew!! So much information. I hope this helps clarify the fundamentals of Rasa Core. In Part 2, we will apply these concepts to the above conversation using different variations of policies and see how each reacts.