Chatbot Development using Deep Learning & NLP implementing Seq2Seq Model

Published in Analytics Vidhya · 7 min read · Oct 22, 2019

Banner vector created by full vector — www.freepik.com

Note: In this post, I’ll only summarize the steps for building the chatbot. All the files for this project, including the complete code, are in my GitHub repository linked below, which you can clone or download. The chatbot.py file contains plenty of comments explaining the what and why of the code, so you shouldn’t have trouble following it. I’ll gladly answer any other questions.
https://github.com/adi2381/ai-chatbot

Before we start, we need to understand a few things in simple terms:

What is a chatbot?

A chatbot can be defined as an application of artificial intelligence that carries out a conversation with a human being via auditory or textual means. Examples of auditory chatbots are Samsung Bixby and Apple’s Siri; textual examples include Telegram and Messenger chatbots.

Why use a chatbot?

Let’s take an example to explain it.

Let’s say you work in customer support for an eCommerce company, and a customer asks a question as simple as “Where is my order?” You would check the system manually and then reply to the user. This is just one instance. Customer support executives face the same question over and over every day, and repeating the same routine (receive a query, look it up in the database, reply to the user) is redundant work. A chatbot can do exactly the same thing: receive the query, look it up in the database, and reply. The people freed up this way can be assigned to tasks where they are more useful. So, automation through chatbots is beneficial in such scenarios.

Where are they used?

They are mainly used in dialog systems for purposes such as providing customer service or acquiring information. Chatbots can be developed for any specific field; in eCommerce, for example, they mainly provide support for orders, payments, and so on.

What is the problem space associated with the chatbots?

The main task of the chatbot is to determine the best response for any query or message it receives from the user. The best response should do the following:

1. Give a direct and accurate answer to the user’s query

2. Provide relevant information related to the query

3. If the query is vague, ask follow-up questions to get more insight into the query and hence generate the best possible response

What is our approach?

We used deep learning together with concepts from NLP (Natural Language Processing), RNNs (Recurrent Neural Networks), and LSTM (Long Short-Term Memory), and implemented the chatbot with the seq2seq model architecture.

Getting Started: Anaconda

1. Install Anaconda

Anaconda Python/R Distribution, Free Download (www.anaconda.com): The open-source Anaconda Distribution is the easiest way to perform Python/R data science and machine learning on…

2. Install Virtual Environment using Anaconda Shell

Go to Start > Anaconda3 > Anaconda Prompt
Run the following command to create a virtual environment. The -n flag sets the name of the virtual environment, and python=3.5 pins the Python version; install the same version to avoid any issues.

conda create -n chatbot python=3.5 anaconda

3. Activate Virtual Environment & Install TensorFlow

In Anaconda Navigator, go to Environments, which you can find in the left sidebar.
You’ll find your virtual environment there along with the root environment, which belongs to Anaconda. Click on your environment > right-click > open with python terminal, and run the following command:

activate nameofyourvirtualenvironment

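Once the virtual environment is activated, install TensorFlow version 1.0.0 specifically, because the code relies on TensorFlow 1.x functions such as tf.reset_default_graph, which are deprecated or no longer available at the top level in later versions (in TensorFlow 2.x it is only reachable as tf.compat.v1.reset_default_graph).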

pip install tensorflow==1.0.0

4. Download the Cornell Movie-Dialogs Corpus Dataset

Cornell Movie-Dialogs Corpus (www.cs.cornell.edu): This corpus contains a large metadata-rich collection of fictional conversations extracted from raw movie scripts: …

Getting Started: Spyder

Launch the Spyder IDE through Anaconda Navigator.
I’m assuming you created a folder named chatbot on the desktop. Copy the downloaded dataset folder into the chatbot folder, then take the movie_lines.txt and movie_conversations.txt files out of the dataset folder and paste them directly into the chatbot folder.
In Spyder, navigate to the chatbot directory using the file explorer pane in the top right, which sits next to other panes such as the variable explorer. Save your Python file to the chatbot directory as well.
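
To check that the two files are usable, here is a minimal sketch of how they can be parsed into question/answer pairs. It relies on the corpus format, where fields are separated by " +++$+++ ". The function names and the sample rows are made up for illustration and are not taken from chatbot.py; with the real files, you would pass in the lines read via open(path, encoding="utf-8", errors="ignore").

```python
import ast

SEPARATOR = " +++$+++ "  # field separator used in both corpus files


def parse_lines(raw_lines):
    """Map each line id (e.g. 'L194') to the text of that line."""
    id_to_line = {}
    for raw in raw_lines:
        fields = raw.rstrip("\n").split(SEPARATOR)
        if len(fields) == 5:  # line id, character id, movie id, name, text
            id_to_line[fields[0]] = fields[4]
    return id_to_line


def parse_conversations(raw_conversations):
    """Return every conversation as a list of line ids."""
    conversations = []
    for raw in raw_conversations:
        fields = raw.rstrip("\n").split(SEPARATOR)
        if len(fields) == 4:  # two character ids, movie id, list of line ids
            conversations.append(ast.literal_eval(fields[3]))
    return conversations


def build_pairs(id_to_line, conversations):
    """Treat each line as a question and the line after it as the answer."""
    questions, answers = [], []
    for conversation in conversations:
        for first, second in zip(conversation, conversation[1:]):
            if first in id_to_line and second in id_to_line:
                questions.append(id_to_line[first])
                answers.append(id_to_line[second])
    return questions, answers


# Made-up sample rows that follow the corpus layout.
sample_lines = [
    "L194 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ Hello there.",
    "L195 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Hi, how are you?",
]
sample_conversations = ["u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195']"]

id_to_line = parse_lines(sample_lines)
conversations = parse_conversations(sample_conversations)
questions, answers = build_pairs(id_to_line, conversations)
print(questions, answers)
```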

Building the Chatbot

Phase I consists of importing the necessary libraries, importing the dataset, and preprocessing the dataset for the training phase of our model. The following steps are taken for data preprocessing:

### Phase 1: Data Preprocessing ###
  # Importing Dataset
  # Creating a dictionary that maps each line with its id
  # Creating a list of all of the conversations
  # Getting questions and answers separately
  # Simplifying and cleaning the text using Regular Expressions
  # Cleaning questions
  # Cleaning answers
  # Filtering out the questions and answers that are too short or too long
  # Creating a dictionary that maps each word to its number of occurrences
  # Creating two dictionaries that map the words in the questions and the answers to a unique integer
  # Adding the last tokens to above two dictionaries
  # Creating the inverse dictionary of the answers_words_to_int dictionary
  # Adding the <EOS> token to the end of every answer
  # Translating all the questions and the answers into int & replacing all the words that were filtered out by <OUT> token
  # Sorting questions and answers by the length of questions
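
The cleaning, filtering, vocabulary, and encoding steps above can be sketched roughly as follows. This is a simplified illustration, not the code from chatbot.py: the contraction rules, the length limits, the frequency threshold, and the single shared vocabulary (the project keeps separate dictionaries for questions and answers) are all assumptions, as is the `<SOS>` token.

```python
import re
from collections import Counter

SPECIAL_TOKENS = ["<PAD>", "<EOS>", "<OUT>", "<SOS>"]  # <SOS> is an assumption


def clean_text(text):
    """Lowercase, expand a few contractions, and drop punctuation."""
    text = text.lower()
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"won't", "will not", text)
    text = re.sub(r"can't", "cannot", text)
    text = re.sub(r"n't", " not", text)
    text = re.sub(r"'re", " are", text)
    text = re.sub(r"'ll", " will", text)
    text = re.sub(r"[^a-z0-9' ]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def filter_and_sort(questions, answers, min_len=2, max_len=25):
    """Drop pairs that are too short or too long, then sort by question length."""
    pairs = [
        (q, a)
        for q, a in zip(questions, answers)
        if min_len <= len(q.split()) <= max_len
        and min_len <= len(a.split()) <= max_len
    ]
    pairs.sort(key=lambda pair: len(pair[0].split()))
    return pairs


def build_vocab(sentences, threshold):
    """Give an integer to every word seen at least `threshold` times."""
    counts = Counter(word for s in sentences for word in s.split())
    vocab = {}
    for word, count in counts.items():
        if count >= threshold:
            vocab[word] = len(vocab)
    for token in SPECIAL_TOKENS:  # special tokens are added last
        vocab[token] = len(vocab)
    return vocab


def encode(sentence, vocab, add_eos=False):
    """Translate words to integers; filtered-out words become <OUT>."""
    ids = [vocab.get(word, vocab["<OUT>"]) for word in sentence.split()]
    if add_eos:
        ids.append(vocab["<EOS>"])
    return ids


questions = [clean_text(q) for q in ["How are you doing today?", "Hi there!"]]
answers = [clean_text(a) for a in ["I'm fine, thanks.", "Hello to you."]]
pairs = filter_and_sort(questions, answers)
vocab = build_vocab(questions + answers, threshold=1)
inverse_vocab = {index: word for word, index in vocab.items()}
encoded = [(encode(q, vocab), encode(a, vocab, add_eos=True)) for q, a in pairs]
print(encoded)
```

Sorting by question length keeps sequences of similar length together, which reduces the amount of padding needed later in each batch.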

Phase II consists of building the seq2seq model for our chatbot. The following concepts are used in this phase:

Recurrent Neural Network
LSTM (Long-Short-Term-Memory)
Seq2Seq Model
Optimization Techniques — Beam Search Decoder and Attention Mechanism
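
To give a feel for the beam search idea, here is a small standalone sketch in plain Python. It is not the decoder used in the project: the toy probability table and all names are invented for illustration. At every step it keeps the `beam_width` most likely partial sentences instead of only the single best one.

```python
import math

# Hypothetical toy "model": next-word probabilities given the last word.
TOY_MODEL = {
    "<SOS>": {"i": 0.6, "you": 0.4},
    "i": {"am": 0.55, "<EOS>": 0.45},
    "you": {"are": 0.9, "<EOS>": 0.1},
    "am": {"<EOS>": 1.0},
    "are": {"<EOS>": 1.0},
}


def toy_next_token_probs(tokens):
    """Look up the distribution over next words for a partial sentence."""
    return TOY_MODEL[tokens[-1]]


def beam_search(next_token_probs, start_token, end_token, beam_width=2, max_len=5):
    """Return the most probable sequence found with the given beam width."""
    beams = [([start_token], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == end_token:
                candidates.append((tokens, score))  # finished, carry it over
                continue
            for token, prob in next_token_probs(tokens).items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the best beam_width candidates for the next step.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens[-1] == end_token for tokens, _ in beams):
            break
    return beams[0][0]


greedy = beam_search(toy_next_token_probs, "<SOS>", "<EOS>", beam_width=1)
wider = beam_search(toy_next_token_probs, "<SOS>", "<EOS>", beam_width=2)
print(greedy)
print(wider)
```

With a beam width of 1 the search is greedy and commits to "i" because it is the most likely first word. With a width of 2 it also keeps "you" alive and ends up with a sentence that has a higher overall probability (0.36 versus 0.33 in this toy table).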

### Phase 2: Building SEQ2SEQ Model ###
  # Creating placeholders for the inputs and the targets
  # Preprocessing the targets
  # Creating the Encoder RNN
  # Decoding the training set
  # Decoding the test/validation set
  # Creating the Decoder RNN
  # Building the seq2seq model
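
As an illustration of the "Preprocessing the targets" step, here is a plain-Python sketch of what this usually means in a seq2seq setup: each target answer is shifted right by one position so the decoder sees a start token first and predicts the next word at every step. I am assuming a `<SOS>` start token here; the function name and the integer ids are made up, and the real code does this on TensorFlow tensors.

```python
def preprocess_targets(batch, sos_id):
    """Prepend <SOS> to every target and drop its last token.

    The decoder input at step t is then the word the decoder should have
    produced at step t - 1, and the sequence length stays unchanged.
    """
    return [[sos_id] + sequence[:-1] for sequence in batch]


# Hypothetical ids: 1 = <EOS>, 0 = <PAD>, 3 = <SOS>.
targets = [[5, 6, 7, 1], [8, 9, 1, 0]]
decoder_inputs = preprocess_targets(targets, sos_id=3)
print(decoder_inputs)
```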

Phase III consists of training the seq2seq model built in the previous phase. The following concepts are used in this phase:

Hyperparameters
Optimization Technique: Adam Optimizer

### Phase 3: Training the SEQ2SEQ Model ###
  # Setting the Hyperparameters
  # Defining a session
  # Loading the model inputs
  # Setting the sequence length
  # Getting the shape of the inputs tensor
  # Getting the training and test predictions
  # Setting up the Loss Error, the Optimizer and Gradient Clipping
  # Padding the sequences with the <PAD> token
  # Splitting the data into batches of questions and answers
  # Splitting the questions and answers into training and validation sets
  # Training
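
The padding, batching, and train/validation split steps can be sketched in plain Python as below. The function names, the validation fraction, and the choice to keep a final smaller batch are my assumptions for illustration; the project code may differ.

```python
def apply_padding(batch, pad_id):
    """Pad every sequence in the batch to the length of the longest one."""
    max_len = max(len(sequence) for sequence in batch)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in batch]


def split_into_batches(questions, answers, batch_size, pad_id):
    """Yield (padded_questions, padded_answers) one batch at a time."""
    for start in range(0, len(questions), batch_size):
        question_batch = questions[start:start + batch_size]
        answer_batch = answers[start:start + batch_size]
        yield apply_padding(question_batch, pad_id), apply_padding(answer_batch, pad_id)


def train_validation_split(questions, answers, validation_fraction=0.15):
    """Hold out the first part of the data for validation."""
    split = int(len(questions) * validation_fraction)
    training = (questions[split:], answers[split:])
    validation = (questions[:split], answers[:split])
    return training, validation


# Tiny encoded example; 0 plays the role of the <PAD> id.
example_questions = [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
example_answers = [[1, 2], [3], [4], [5, 6, 7]]
batches = list(split_into_batches(example_questions, example_answers, 2, pad_id=0))
training, validation = train_validation_split(
    example_questions, example_answers, validation_fraction=0.25
)
print(batches[0])
```

Padding is applied per batch, so each batch is only as wide as its longest sequence.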

Phase IV consists of setting up the chatbot, which follows the training phase.

### Phase 4: Testing The Seq2Seq Model ###
  
  # Loading the weights and Running the session
  # Converting the questions from strings to lists of encoding integers
  # Setting up the chat
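
A rough sketch of the chat setup is shown below. The `predict` argument stands in for running the trained model in a TensorFlow session and is purely hypothetical, as are the function names, the fixed input length of 25, and the "goodbye" exit word.

```python
def encode_question(question, word_to_int, length):
    """Turn a question string into a fixed-length list of integers."""
    words = question.lower().split()[:length]
    ids = [word_to_int.get(word, word_to_int["<OUT>"]) for word in words]
    return ids + [word_to_int["<PAD>"]] * (length - len(ids))


def decode_answer(ids, int_to_word):
    """Turn predicted integers back into text, stopping at <EOS>."""
    words = []
    for index in ids:
        word = int_to_word[index]
        if word == "<EOS>":
            break
        if word != "<PAD>":
            words.append(word)
    return " ".join(words)


def chat(predict, word_to_int, int_to_word, length=25):
    """Simple console loop; `predict` maps encoded question -> answer ids."""
    while True:
        question = input("You: ")
        if question.strip().lower() == "goodbye":
            break
        answer_ids = predict(encode_question(question, word_to_int, length))
        print("ChatBot: " + decode_answer(answer_ids, int_to_word))
```

Calling chat(...) with a real prediction function starts the loop; typing "goodbye" ends it.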

Results

Figures: my first session with the chatbot; my second session with the chatbot

As you can see, the replies are not appropriate. The reason is insufficient training. We divided our dataset of 4200 examples into batches of 100, so it took 42 iterations to complete one epoch. I trained the model for 15 epochs and, as the results show, that wasn’t enough. The training phase could perhaps be improved by further tuning the hyperparameters or simply by training for more epochs.
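
The iteration count follows directly from the dataset and batch sizes. A quick check of the arithmetic (the helper name is just for illustration):

```python
import math


def iterations_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)


per_epoch = iterations_per_epoch(4200, 100)  # 4200 / 100 = 42
total_iterations = per_epoch * 15            # 15 epochs of 42 iterations each
print(per_epoch, total_iterations)
```

So the whole run amounted to only 630 weight updates.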

For training the model, I used Google Colab because it provides an Nvidia Tesla K80 along with 12GB of RAM in a virtual environment for up to 12 hours, for free. The steps for training your model in Google Colab are in my GitHub repo.

adi2381/ai-chatbot (github.com): This is an attempt at building a ChatBot using the Seq2Seq model. This model is based on 2 LSTM Layers. Seq2Seq mainly…

Conclusion

Chatbots are an essential application of artificial intelligence, and the chatbot industry is booming, with top companies using chatbots in their latest devices, such as Amazon’s Alexa or Samsung’s Bixby. The field remains challenging: open problems include how to improve the answers and how to choose the model that produces the most appropriate answer for a given query. In this post, we tried only one of the many variants of the seq2seq architecture, coupled with specific optimization techniques, to build our chatbot. I hope you enjoyed the post and that it piqued your interest in chatbots.

Shoutout

This section is dedicated to the amazing tutorials and articles on chatbot development that helped me a lot in building the bot. I specifically followed these:

* Ultimate Guide to Leveraging NLP & Machine Learning for your Chatbot — Stefan Kojouharov
(https://chatbotslife.com/ultimate-guide-to-leveraging-nlp-machine-learning-for-you-chatbot-531ff2dd870c)
* seq2seq model in Machine Learning — Mani Wadhwa(GeeksforGeeks)
(https://www.geeksforgeeks.org/seq2seq-model-in-machine-learning/)
* Python Chat Bot Tutorial — Chatbot with Deep Learning — Tech With Tim
(https://www.youtube.com/watch?v=wypVcNIH6D4)