Understanding Rasa: A look under the hood of Rasa NLU

Gaurav G
Jul 7, 2018
Rasa Stack

Natural Language Processing is a vast and interesting subject, and the industry has made massive strides in it over the past few years. Platforms like Google’s Dialogflow (formerly API.ai), Facebook’s wit.ai and SAP’s recast.ai have made chatbot development much easier. Among these platforms, one major open source option offers greater flexibility and control: Rasa.

What is Rasa?

Rasa is a conversational AI framework that helps you build your very own chatbot. It offers a wide variety of options, including greater control over your data and customizable NLP pipelines. At its core, Rasa includes:

  • NLU: A Natural Language Understanding system that determines what the user wants and captures key contextual information
  • Core: selects the next best response or action based on conversation history

For this article, we are mainly going to look at how Rasa NLU is trained. Rasa NLU is a tool for understanding what is being said in short pieces of text. For example, given a short text message like:

“I’m looking for a Chinese restaurant in my area”

It returns structured data like:

intent: search_restaurant
entities:
- cuisine : Chinese
- location : my area
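
As a rough illustration, the same result can be pictured as a Python dictionary. The field names below (an intent with a name and confidence, entities with start and end offsets) follow the JSON shape Rasa NLU commonly returns, but the confidence value is made up and the helper make_entity is hypothetical:

```python
# Illustrative only: the confidence score is invented and make_entity
# is a hypothetical helper, not part of Rasa NLU.
text = "I'm looking for a Chinese restaurant in my area"


def make_entity(text, value, entity):
    """Locate `value` in `text` and describe it as an entity span."""
    start = text.index(value)
    return {"start": start, "end": start + len(value),
            "value": value, "entity": entity}


parsed = {
    "text": text,
    "intent": {"name": "search_restaurant", "confidence": 0.9},
    "entities": [
        make_entity(text, "Chinese", "cuisine"),
        make_entity(text, "my area", "location"),
    ],
}

# Each entity's offsets point back into the original message.
for ent in parsed["entities"]:
    print(ent["entity"], "->", text[ent["start"]:ent["end"]])
```

The offsets are computed from the text rather than typed by hand, so the entity spans always line up with the message.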

Beginning at the Start

First, we looked at the command that starts the training of our model:

python -m rasa_nlu.train \
--config sample_configs/config_spacy.yml \
--data data/examples/rasa/demo-rasa.json \
--path projects

This led us to train.py in the rasa_nlu folder. That’s where all the magic happens (or at least the training, in this case). Digging deeper, we found the do_train() function, which is called when the above command is executed. It takes the config, data and path specified on the command line.
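
To make the hand-off concrete, here is a minimal sketch of how command-line flags like the ones above could be parsed and passed on to a training function. It is not the actual contents of train.py: parse_args and the stand-in do_train below are assumptions for illustration, and only the flag names and file paths come from the command.

```python
import argparse


def parse_args(argv):
    """Parse the three flags used in the training command."""
    parser = argparse.ArgumentParser(description="toy training entry point")
    parser.add_argument("--config", required=True, help="pipeline config file")
    parser.add_argument("--data", required=True, help="training data file")
    parser.add_argument("--path", default=None, help="where to store the model")
    return parser.parse_args(argv)


def do_train(config, data, path=None):
    """Stand-in that only reports what it received (no real training)."""
    return {"config": config, "data": data, "path": path}


args = parse_args([
    "--config", "sample_configs/config_spacy.yml",
    "--data", "data/examples/rasa/demo-rasa.json",
    "--path", "projects",
])
result = do_train(args.config, args.data, args.path)
print(result)
```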

The Magic function

Trainer

Now, in the do_train() function, we see that a Trainer object is created first.

trainer = Trainer(cfg, component_builder)

Two arguments are passed to the Trainer class in model.py:

  • config (loaded from sample_configs/config_spacy.yml) to use for the training, and
  • component_builder, an optional helper that creates the pipeline’s components and can cache them for reuse. In our case, we had not specified a component_builder.

The Trainer reads the pipeline named in the config and uses the component_builder to build each component of that pipeline. As we had not specified a component_builder, a default one is used, and the spaCy component pipeline is loaded for us because that is what the spaCy config asks for.
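
The idea of building a pipeline from a config while reusing already-built components can be sketched in a few lines. Everything here is a simplified stand-in: ToyComponent, ToyComponentBuilder, ToyTrainer and the component names are hypothetical and do not mirror Rasa’s real classes.

```python
class ToyComponent:
    def __init__(self, name):
        self.name = name


class ToyComponentBuilder:
    """Creates components by name and caches them for reuse."""

    def __init__(self):
        self._cache = {}

    def create(self, name):
        if name not in self._cache:
            self._cache[name] = ToyComponent(name)
        return self._cache[name]


class ToyTrainer:
    """Builds its pipeline from the config, via the builder."""

    def __init__(self, cfg, component_builder=None):
        # Fall back to a fresh builder when none is supplied.
        if component_builder is None:
            component_builder = ToyComponentBuilder()
        self.pipeline = [component_builder.create(n) for n in cfg["pipeline"]]


cfg = {"language": "en", "pipeline": ["tokenizer", "featurizer", "classifier"]}
shared = ToyComponentBuilder()
first = ToyTrainer(cfg, shared)
second = ToyTrainer(cfg, shared)

print([c.name for c in first.pipeline])
# Both trainers used the same builder, so they share component objects.
print(first.pipeline[0] is second.pipeline[0])
```

Passing a shared builder is only worthwhile when components are expensive to create; with no builder supplied, each trainer simply gets its own fresh components.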

NLP Pipeline. Source: Adam Geitgey

Coreference resolution is an optional step that isn’t always done.

Persistor

The next step creates the persistor, which is responsible for storing the trained model in remote storage. You can specify AWS, Google Cloud Storage, Azure or None. This is optional, and we did not specify one.

persistor = create_persistor(storage)

Training Data

Next, the training data is loaded either from a URL or from a local file. We had specified a local file in our command (data/examples/rasa/demo-rasa.json).

if url is not None:
    training_data = load_data_from_url(url, cfg.language)
else:
    training_data = load_data(data, cfg.language)

Interpreter

This data is passed to the trainer we created above, which trains on it and generates an interpreter.

interpreter = trainer.train(training_data, **kwargs)

The train() function in the Trainer class takes the training_data and trains each component of the pipeline on it, which in our case is the spaCy-based pipeline. The result is returned as the interpreter, which can interpret input text and give structured data as output.
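
A minimal sketch of that train step, assuming each pipeline component exposes a train method and a process method. The classes KeywordIntentClassifier, ToyInterpreter and ToyTrainer are hypothetical and far simpler than a spaCy-backed pipeline:

```python
from collections import Counter, defaultdict


class KeywordIntentClassifier:
    """Toy component: counts which words appear under which intent."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def train(self, examples):
        for text, intent in examples:
            for word in text.lower().split():
                self.word_counts[intent][word] += 1

    def process(self, message):
        # Score each intent by how often it has seen the message's words.
        words = message["text"].lower().split()
        scores = {intent: sum(counts[w] for w in words)
                  for intent, counts in self.word_counts.items()}
        message["intent"] = max(scores, key=scores.get)


class ToyInterpreter:
    """Runs every trained component over a new message."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def parse(self, text):
        message = {"text": text}
        for component in self.pipeline:
            component.process(message)
        return message


class ToyTrainer:
    def __init__(self, pipeline):
        self.pipeline = pipeline

    def train(self, examples):
        # Train each component in order, then wrap them in an interpreter.
        for component in self.pipeline:
            component.train(examples)
        return ToyInterpreter(self.pipeline)


examples = [
    ("find me a chinese restaurant", "search_restaurant"),
    ("looking for a place to eat", "search_restaurant"),
    ("hello there", "greet"),
    ("good morning", "greet"),
]
interpreter = ToyTrainer([KeywordIntentClassifier()]).train(examples)
print(interpreter.parse("any chinese restaurant nearby"))
```

The point of the sketch is the shape of the flow: training walks the pipeline once, and the returned interpreter walks the same pipeline again for every new message.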

Persisted Path

If a path is provided, as in our case (projects), the trained model is persisted to that path.

The Result

And finally, the do_train() function returns the trainer, the interpreter and the persisted_path. We can now use the generated interpreter to extract intents and entities from new messages, which is the Natural Language Understanding part of the Rasa stack.
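
Putting the steps together, the overall flow can be sketched end to end. This is a toy version under stated assumptions: toy_do_train, toy_load_data and ToyTrainer are hypothetical names, the "model" is just the most frequent intent, and the training file is a small JSON list rather than Rasa’s own data format.

```python
import json
import os
import tempfile


class ToyTrainer:
    """Learns the most frequent intent; stands in for a real pipeline."""

    def __init__(self, cfg):
        self.cfg = cfg
        self.model = None

    def train(self, examples):
        intents = [ex["intent"] for ex in examples]
        self.model = {"default_intent": max(set(intents), key=intents.count)}
        # The "interpreter" here is just a function over the trained model.
        return lambda text: {"text": text,
                             "intent": self.model["default_intent"]}

    def persist(self, path):
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "model.json"), "w") as f:
            json.dump(self.model, f)
        return path


def toy_load_data(data_file):
    with open(data_file) as f:
        return json.load(f)


def toy_do_train(cfg, data, path=None):
    """Build the trainer, load data, train, and persist if a path is given."""
    trainer = ToyTrainer(cfg)
    training_data = toy_load_data(data)
    interpreter = trainer.train(training_data)
    persisted_path = trainer.persist(path) if path else None
    return trainer, interpreter, persisted_path


with tempfile.TemporaryDirectory() as tmp:
    data_file = os.path.join(tmp, "demo.json")
    with open(data_file, "w") as f:
        json.dump([
            {"text": "hi", "intent": "greet"},
            {"text": "chinese food nearby", "intent": "search_restaurant"},
            {"text": "find a restaurant", "intent": "search_restaurant"},
        ], f)

    trainer, interpreter, persisted_path = toy_do_train(
        {"language": "en"}, data_file, path=os.path.join(tmp, "projects"))
    print(interpreter("somewhere to eat"))
    print(os.listdir(persisted_path))
```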

Lightsaber locked and loaded

Equipped with this knowledge of how a trainer works in an NLP framework, I began my journey deeper into chatbot development. I do have more questions than when I started, but at least now I know where to look and what direction to take. Starting at the end does have its perks: the questions it raises help me direct my focus towards a path that goes deeper.

I hope this article helped you get started with, or enhanced your experience of, Rasa and its NLU engine. Feel free to drop any queries you have below.

Gaurav G

Creative Tech Officer @ Coffee | GA @ SU Chennai | Author, Nuggets of Wisdom