Chatbots for Beginners: Understanding Rasa

Mehul Gupta

Published in

Data Science in your pocket

5 min read · Jul 20, 2020

Moving ahead from text summarization, it's time for chatbots !!

Rasa is a prominent Python framework for building a chatbot from scratch !!

So this time, I will be exploring the basics of Rasa & its major components.

Follow the official guide for installation:

Installation

You can install Rasa Open Source using pip (requires…

rasa.com

Once done, run this command in a new directory (name the directory anything !!)

rasa init

I will be discussing the major files/components you will be able to see. The rest of the post deals with the role of each file.

Rasa has 2 major halves:

NLU & Core.

NLU deals with understanding the training data & interpreting the intent (core idea of the message)
Core handles the entire flow of conversation. For example: if an intent is detected, which action should be taken by the bot is decided by Core.

Surprisingly, even Core is trainable, i.e. it also uses Machine Learning under the hood to decide on the actions given the intents & not just if & else. I will be elaborating on the two major halves & their sub-components below:

Natural Language Understanding (NLU)

You can find a file data/nlu.md in your bot directory. nlu.md basically consists of training data for your chatbot.

But it isn’t a tabular dataset like Titanic !!

Each sample can be treated as a combination of 2 things:

Intent
Corresponding examples

The intent represents the core idea of the message/sample text & the examples are the sample texts/messages over which the model trains.

For example

## intent:bye
  - Bye !!
  - tata
  - see you soon

## intent:greet
  - Hi
  - Hey
  - How you doing?

Here,

Intent=bye & its examples are the 3 sentences below it. Note that nlu.md has many such intent+example combinations, like greet (Hi, Hey), thank (thank you, thanks for the help), etc.
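To get a feel for this format, here is a minimal Python sketch that groups the examples under their intents. It is only an illustration of the file layout, not how Rasa itself reads nlu.md; the names NLU_MD & parse_nlu_markdown are made up for this post.

```python
from collections import defaultdict

# Same content as the nlu.md snippet above, kept inline so the sketch runs alone.
NLU_MD = """\
## intent:bye
  - Bye !!
  - tata
  - see you soon

## intent:greet
  - Hi
  - Hey
  - How you doing?
"""


def parse_nlu_markdown(text):
    """Group example sentences under the intent heading they follow."""
    examples = defaultdict(list)
    current_intent = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## intent:"):
            # A new intent block starts; remember its name.
            current_intent = line[len("## intent:"):].strip()
        elif line.startswith("- ") and current_intent is not None:
            # A bullet is one training example for the current intent.
            examples[current_intent].append(line[2:].strip())
    return dict(examples)


training_data = parse_nlu_markdown(NLU_MD)
for intent, sentences in training_data.items():
    print(intent, "->", sentences)
```

Each intent ends up with its own list of sentences, which is roughly the shape of labelled data an intent classifier trains on.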

NLU comprises just nlu.md !!

The rest of the components belong to Core.

Stories

They can be found in data/stories.md.

A story is a combination of an intent + the corresponding action the bot should take. Hence, once a certain intent is captured (after training over nlu.md), the bot performs the action mapped against it in stories.md.

## story1
* bye
  - utter_bye

Here, story1 is the name of the story (it can be anything & is not of much relevance), bye represents the intent & utter_bye is the name of the action to be performed.
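The structure of a story can be made concrete with a small Python sketch that reads this format into (intent, actions) steps. It is a toy reader written for this post, not Rasa's own; parse_stories & STORIES_MD are hypothetical names, and story2 is an extra story added only for illustration.

```python
# story1 is the story from above; story2 is made up for this example.
STORIES_MD = """\
## story1
* bye
  - utter_bye

## story2
* greet
  - utter_greet
"""


def parse_stories(text):
    """Return {story_name: [(intent, [action, ...]), ...]}."""
    stories = {}
    steps = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            # '## name' opens a new story.
            steps = stories.setdefault(line[3:].strip(), [])
        elif line.startswith("* ") and steps is not None:
            # '* intent' is a user turn; its actions follow as bullets.
            steps.append((line[2:].strip(), []))
        elif line.startswith("- ") and steps:
            # '- action' belongs to the most recent user turn.
            steps[-1][1].append(line[2:].strip())
    return stories


stories = parse_stories(STORIES_MD)
for name, story_steps in stories.items():
    print(name, story_steps)
```

Keep in mind that Core learns from such stories through its policies, so this is only a picture of the data, not of the decision making.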

But where is utter_bye?

But before that, we must know what actions are.

Actions

Actions are the responses our bot gives after:

It understands the user's intent from his/her message
It finds a mapping in stories.md (not a necessity for triggers, an advanced topic that can be left for now)

Actions can be of 2 types:

Plain text replies (utter actions). Such action names start with ‘utter_’, for example utter_bye. Their text is mostly defined in the ‘responses’ section of domain.yml.
Custom actions that do more than reply with text, like calculations, initiating a form, etc. By convention such action names start with ‘action_’, for example action_start-form, which might start a form asking for the name & query of the user if the user says: ‘need to register a query’. They are mostly mentioned in the ‘actions’ section of domain.yml.

But, what is domain.yml?

Domain

Domain can be taken as a parent in this whole ecosystem of the bot with the following information:

Intents (only names) defined in nlu.md
Actions (only names)
Responses (utter action names mapped with text to be uttered to the user)
Entities: Variable names to store values.

intents:
  - greet
  - bye
  - thank

entities:
  - name
  - query

responses:
  utter_greet:
    - text: Hi
  utter_bye:
    - text: Bye!

actions:
  - action_fill-form

Note: In the ‘responses’ section, only utter actions can be defined. Utter actions can be mentioned in the ‘actions’ section as well, though. Read this slowly!!

1. Responses provide the definition for utter actions only. In the above example, ‘utter_bye’ is the action name & the next line is the corresponding text response by the bot.

2. Actions consist of names of actions which are

Custom actions other than utter actions
Utter actions if we wish to define them in actions.py.

Why two different categories i.e Actions & Responses serving almost the same purpose?

When an action is mentioned in the ‘actions’ section, it has to be defined in the actions.py file using the Rasa SDK (Python code).
So, when the bot just needs to respond with text (& no other logic or activities), Rasa provides a simpler way: define the reply in the ‘responses’ section. Such actions can also be defined in actions.py using the Rasa SDK, but then their names should go in the ‘actions’ section.
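For a rough idea of what a definition in actions.py looks like, here is a minimal sketch for the action_fill-form action listed in the domain above. Action, CollectingDispatcher, utter_message & tracker.latest_message come from the rasa_sdk package as I understand it; the class name ActionFillForm & the reply text are assumptions made for this post. The except branch only provides small stand-ins (not the real classes) so that the sketch can be run without rasa_sdk installed.

```python
# actions.py (sketch)
try:
    from rasa_sdk import Action
    from rasa_sdk.executor import CollectingDispatcher
except ImportError:
    # Stand-ins so the sketch runs without rasa_sdk installed.
    # They are NOT the real classes, only the pieces used below.
    class Action:
        pass

    class CollectingDispatcher:
        def __init__(self):
            self.messages = []

        def utter_message(self, text=None, **kwargs):
            self.messages.append({"text": text, **kwargs})


class ActionFillForm(Action):
    """Custom action listed under 'actions' in domain.yml."""

    def name(self):
        # Must match the action name used in domain.yml & stories.md.
        return "action_fill-form"

    def run(self, dispatcher, tracker, domain):
        # latest_message is a dict describing the last user message.
        user_text = tracker.latest_message.get("text", "")
        # Any Python logic can go here; this sketch just echoes the query.
        dispatcher.utter_message(text="Registering your query: " + user_text)
        return []  # no events to report back to Rasa
```

In a real project this class is served by the action server (started with rasa run actions), and Rasa calls run() whenever Core picks action_fill-form.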

3. Entities are the pieces of information we extract from user messages (for example a name) so that the bot can store & use them. Consider the case where the user wishes to register an inquiry:

User utters ‘I wish to register a query’
After analyzing the intent (say, register inquiry in nlu.md), the bot starts a form to be filled

Now, the user provides us with the information (name & query) which has to be stored in variables for which entities are used.

Entity extraction can be enabled by adding entity extractors to the pipeline in config.yml.

There is a lot to entities & hence, I will be covering them in my next post!!

Endpoints

To run the custom actions in actions.py, Rasa needs the endpoint (URL) of the action server that executes them. This endpoint is set in the endpoints.yml file.

Last, but still an important part:

Config

It is defined in config.yml. config.yml has 3 major parts:

Language: The language the bot is trained on, e.g. en for English (Spanish, etc. are possible too)
Pipeline: The pipeline consists of a number of steps followed for processing the text received from the user. This can include various tokenizers, different featurizers (extracting features from text), entity extractors, etc.
Policies: Policies help the bot decide which action should be chosen once the intent has been extracted. Many policies can be listed at once to cover different circumstances. For example: the memoization policy checks whether the last ‘n’ turns of the conversation match a training story & if so, predicts the next action from that story, where ‘n’ (max_history) is a hyperparameter.
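The idea behind the memoization policy can be sketched in a few lines of plain Python. This is a toy version written for this post, assuming stories are flat lists of intents & action names; Rasa's real policy works on featurized conversation states, so treat build_memo & predict_next_action as hypothetical helpers. Only the parameter name max_history is taken from Rasa.

```python
MAX_HISTORY = 2  # how many past events the lookup considers (assumed >= 1)

# One training story as a flat list: intents & the actions that follow them.
STORIES = [["greet", "utter_greet", "bye", "utter_bye"]]


def build_memo(stories, max_history):
    """Map the last `max_history` events before each action to that action."""
    memo = {}
    for events in stories:
        for i, event in enumerate(events):
            if event.startswith(("utter_", "action_")):
                history = tuple(events[max(0, i - max_history):i])
                memo[history] = event
    return memo


def predict_next_action(memo, conversation, max_history):
    """Return the memorised action for the recent history, or None."""
    return memo.get(tuple(conversation[-max_history:]))


memo = build_memo(STORIES, MAX_HISTORY)
seen = predict_next_action(memo, ["greet"], MAX_HISTORY)
unseen = predict_next_action(memo, ["bye"], MAX_HISTORY)
print(seen, unseen)
```

A conversation that matches the training story gets its memorised action, while an unseen history gets nothing, which is why a memoization-style policy is usually listed together with other policies.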

Also,

Once all the setup is done, both the NLU & Core parts are trained & the trained models go in models/.
tests/ contains conversation_tests.md, where we can write test cases to check whether our bot is performing well or not.
results/ contains the report of the test cases from conversation_tests.md that the bot failed.

So cutting short, we would be mostly interacting with 5 files:

nlu.md, stories.md, domain.yml, config.yml & actions.py

All sorts of code examples can be found in the Rasa documentation.

I guess this much understanding is enough to begin building a bot using Rasa. I will try to go through the code of a basic chatbot next !!