Building an NLP Chatbot for a restaurant with Flask
Want to build a chatbot personalized to a particular business, but have very little data, or don’t have time to go through the hassle of creating business-specific data for tasks like intent classification and named entity recognition? This blog is a solution to just that!
For a machine to completely understand the diverse ways a human could query something, and to respond in natural language just as a human would: to me, that feels like almost everything we could ever want to achieve through NLP. Hence, this is one application I have always been intrigued by.
A few weeks back, I finally set out to design my first NLP chatbot! Of course, I have deliberated (with myself, lol) on the nature of this chatbot — and I came to the profound decision (my face was stuffed with food and I was looking for desserts to order online) that my chatbot would serve a restaurant by chatting and assisting patrons.
Functionalities of the Chatbot:
- Greet
- Show menu
- Show offers available
- Show just vegetarian options if available
- Show vegan options if available
- Explain more about any particular Food item, giving details of its preparation and ingredients
- Assure customers about the COVID protocols and hygiene followed by the restaurant
- Tell the hours the restaurant is open
- Check if tables are available
- Book a table if available and give the customer a unique booking ID
- Suggest what to order
- Answer if asked if they are a bot or human
- Give contact information of the restaurant
- Give the address of the restaurant
- Take positive feedback, respond accordingly, and store it for the Restaurant management to check
- Take negative feedback, respond accordingly, and store it for the Restaurant management to check
- Respond to some general messages
- Bid goodbye
Final Outcome:
Please click on the Full Screen button, and change the quality from SD to HD, to see it clearly.
Overview:
Creation of embedded_dataset.json:
First, we embed our dataset, which will be used as input to the chatbot. This is a one-time job.
Overview of the whole architecture:
How to set up and run the project?
This is just to get the project up and running; I will explain the parts one by one deeper into the blog :)
1. Install Pre-requisites
My Python version is 3.6.13.
To install all the required libraries, download/clone my GitHub repo and in the folder, open CMD and enter:
> pip install -r requirements.txt
These are the contents of the requirements.txt file:
numpy
nltk
tensorflow
tflearn
flask
sklearn
pymongo
fasttext
tsne
2. Download pre-trained FastText English model
Download cc.en.300.bin.gz from here. Unzip it to get cc.en.300.bin; the code for this step is in the helper scripts in my GitHub repo.
3. Prepare dataset
Run data_embedder.py. This will take the dataset.json file and convert all the sentences to FastText vectors.
> python data_embedder.py
4. Set up MongoDB on localhost
Install MongoDB Compass
Make 3 collections: menu, bookings, feedback
The menu has to be hardcoded, since it is specific to the restaurant. Populate it with the food items the eatery provides, their prices, etc. Each document includes item, cost, vegan, veg, about, and offer. I made a small JSON file with the data and imported it into MongoDB Compass to populate the menu collection. You can find my menu data here.
One example document in menu:
feedback docs will be inserted whenever a user gives feedback, so that the restaurant management can read them and take necessary action.
Example docs in the feedback collection:
The bookings collection stores the unique booking ID and the timestamp of the booking, so that when the customer shows the ID at the reception, the booking can be verified.
5. Run Flask
This will launch the web app on localhost
> export FLASK_APP=app
> export FLASK_ENV=development
> flask run
Implementation:
Our friendly little bot’s job has two major parts:
- Intent classification: understanding the intent of a message, i.e., what the customer is querying for
- Conversation design: deciding how the conversation goes, responding to each message as per its intent
For example,
The user sends a message: “Please show me the vegetarian items on the menu?”
- The chatbot identifies the intent as “veg_enquiry”
- And then the chatbot acts accordingly: it queries the restaurant DB for vegetarian items and communicates them to the user.
Now, let us go through it step by step.
1. Building Dataset
The dataset is a JSON file with three fields: tag, patterns, and responses. For each intent, we record a few possible messages and some possible responses. For some of the intents, the responses are left empty, because they require further action to determine the response. For example, for the query “Are there any offers going on?”, the bot would first have to check in the database whether any offers are active, and then respond accordingly.
The dataset looks like this:
{"intents": [
{"tag": "greeting",
"patterns": ["Hi", "Good morning!", "Hey! Good morning", "Hello there", "Greetings to you"],
"responses": ["Hello I'm Restrobot! How can I help you?", "Hi! I'm Restrobot. How may I assist you today?"]
},
{"tag": "book_table",
"patterns": ["Can I book a table?","I want to book a seat", "Can I book a seat?", "Could you help me book a table", "Can I reserve a seat?", "I need a reservation"],
"responses": [""]
},
{"tag": "goodbye",
"patterns": ["I will leave now","See you later", "Goodbye", "Leaving now, Bye", "Take care"],
"responses": ["It's been my pleasure serving you!", "Hope to see you again soon! Goodbye!"]
},
.
.
.
2. Normalising messages
The first step is to normalize the messages. In natural language, humans may say the same thing in many ways. When we normalize text, we reduce its randomness, bringing it closer to a predefined “standard”. This reduces the amount of different information the computer has to deal with, and therefore improves efficiency. We take the following steps to normalize all texts, both the messages in our dataset and the messages sent by customers:
- Convert all to lower case
- Remove punctuation
- Remove stopwords: since the dataset is small, using the NLTK stopword list stripped it of many words that were important in this context. So I wrote a small script to get the words and their frequencies across the whole document, and manually selected the inconsequential words to make this list
- Lemmatization: reducing each word to its base or dictionary form (lemma), using a vocabulary and morphological analysis of words to remove inflectional endings only
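The steps above can be sketched as a small pipeline. The stopword list below is a hypothetical hand-picked one standing in for the list I built with the frequency script, and the lemmatization step (done with NLTK in the real code) is left as a comment to keep the sketch self-contained:

```python
import string

# Hypothetical hand-picked stopword list; the real one was chosen
# manually after inspecting word frequencies over the whole dataset.
STOPWORDS = {"the", "a", "an", "me", "please", "to", "of"}

def normalize(text):
    text = text.lower()  # convert all to lower case
    # remove punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # remove stopwords
    tokens = [t for t in text.split() if t not in STOPWORDS]
    # (the real pipeline also lemmatizes each token here,
    # e.g. with NLTK's WordNetLemmatizer)
    return " ".join(tokens)

print(normalize("Please show me the vegetarian items on the menu?"))
# -> show vegetarian items on menu
```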
3. Sentence Embedding:
We use the FastText pre-trained English model cc.en.300.bin.gz, downloaded from here, and the get_sentence_vector() function provided by the fasttext library. It works as follows: each word in the sentence is converted to its FastText word vector, each vector is divided by its L2 norm, and then the average of only the vectors with a positive L2 norm is taken.
After embedding the sentences in the dataset, I wrote them back into a JSON file called embedded_dataset.json and kept it for later use while running the chatbot.
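That averaging can be sketched with plain NumPy. The toy 3-d word vectors below are hypothetical stand-ins for the 300-d FastText ones:

```python
import numpy as np

# Toy word-vector table standing in for the 300-d FastText model.
WORD_VECTORS = {
    "show": np.array([0.2, 0.4, 0.1]),
    "menu": np.array([0.5, 0.1, 0.3]),
}

def sentence_vector(sentence, dim=3):
    """Mimic fasttext's get_sentence_vector(): L2-normalize each
    word vector, then average the vectors whose norm is positive."""
    normed = []
    for word in sentence.split():
        vec = WORD_VECTORS.get(word)
        if vec is None:
            continue
        norm = np.linalg.norm(vec)
        if norm > 0:
            normed.append(vec / norm)  # divide each vector by its L2 norm
    if not normed:
        return np.zeros(dim)
    return np.mean(normed, axis=0)     # average of the normalized vectors
```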
4. Intent Classification:
Intent classification means understanding the intention of a message, i.e., what the customer is basically querying for: given a sentence/message, the bot should be able to box it into one of the pre-defined intents.
Intuition:
In our case, we have 18 intents that demand 18 different kinds of responses.
Now to achieve this with machine learning or deep learning techniques, we would require a lot of sentences, annotated with their corresponding intent tags. However, it was hard for me to generate such a large intent annotated dataset specific to a restaurant’s requirements, with the customized 18 labels. So I came up with my own solution for this.
I made a small dataset, with a few example messages for each of the 18 intents. Intuitively, all these messages, when converted to vectors with a word embedding model (I have used pre-trained FastText English model), and represented on a 2-D space should lie close to each other.
To validate my intuition, I took 6 such groups of sentences and plotted them on a t-SNE graph. I used K-means unsupervised clustering, and, as expected, the sentences mapped clearly into 6 distinct groups in the vector space:
The code for the t-SNE visualization of the sentences is here; I will not go into its details in this post.
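The shape of that validation looks roughly like this; here I use random, well-separated clusters of 300-d vectors as hypothetical stand-ins for the embedded sentence groups:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# 6 hypothetical "intent" clusters of 10 sentence vectors each,
# 300-d like the FastText embeddings.
centers = rng.normal(size=(6, 300)) * 5
vectors = np.vstack([c + rng.normal(scale=0.3, size=(10, 300)) for c in centers])

# K-means should recover the 6 groups from the vectors alone...
labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(vectors)
# ...and t-SNE projects them to 2-D for plotting (colour points by `labels`).
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(vectors)
```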
Implementing intent classification:
Given a message, we need to identify which intent (sentence cluster) it is closest to. We find the closeness with cosine similarity.
Cosine similarity is a metric that measures how similar two documents (sentences/messages) are, irrespective of their size. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. It is advantageous because even if two similar documents are far apart by Euclidean distance (due to document size), chances are they are still oriented close together. The smaller the angle, the higher the cosine similarity.
The logic for finalizing the intent is explained in the comments of the detect_intent() function:
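A minimal version of that idea might look like the sketch below; the threshold value and the "no_answer" fallback tag are assumptions, not the exact ones from my repo:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def detect_intent(message_vec, embedded_dataset, threshold=0.6):
    """Return the tag whose example sentences lie closest to the message.

    embedded_dataset maps each tag to its list of embedded pattern
    vectors (as stored in embedded_dataset.json).
    """
    best_tag, best_score = None, -1.0
    for tag, pattern_vecs in embedded_dataset.items():
        for vec in pattern_vecs:
            score = cosine_similarity(message_vec, vec)
            if score > best_score:          # keep the closest pattern so far
                best_tag, best_score = tag, score
    # If nothing is similar enough, fall back to a catch-all tag.
    return best_tag if best_score >= threshold else "no_answer"
```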
5. Database to hold restaurant info
Here we use pymongo to store the restaurant's information. I created three collections:
1. menu has fields: item, cost, vegan, veg, about, offer -> app.py queries it
2. feedback has fields: feedback_string, type -> docs are inserted into it by app.py
3. bookings has fields: booking_id, booking_time -> docs are inserted into it by app.py
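To make the document shapes concrete, here are hypothetical examples for each collection; only the field names come from the lists above, the values are illustrative. In the real app each would be written with pymongo, e.g. db.bookings.insert_one(booking_doc):

```python
import uuid
from datetime import datetime

# menu: hardcoded per restaurant, queried by app.py
menu_doc = {
    "item": "Margherita Pizza",   # hypothetical item
    "cost": 250,
    "vegan": False,
    "veg": True,
    "about": "Classic pizza with tomato, mozzarella and basil.",
    "offer": "10% off",
}

# feedback: inserted whenever a customer leaves feedback
feedback_doc = {
    "feedback_string": "Loved the desserts!",
    "type": "positive",
}

# bookings: one doc per confirmed reservation
booking_doc = {
    "booking_id": str(uuid.uuid4()),              # unique ID shown to the customer
    "booking_time": datetime.now().isoformat(),   # timestamp of the booking
}
```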
6. Generate a response and act as per the message
In dataset.json we have already kept a list of responses for some of the intents; for these intents, we just randomly choose a response from the list. But for a number of intents, we have left the responses empty; in those cases, we have to generate a response or take an action as per the intent: querying info from the database, creating a unique ID for a booking, checking the recipe of an item, etc.
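That split between canned and action-based responses might be sketched like this; the offers handler and its text are hypothetical, while the greeting responses are taken from dataset.json:

```python
import random

# Canned responses (from dataset.json) for intents that need no action.
RESPONSES = {
    "greeting": ["Hello I'm Restrobot! How can I help you?",
                 "Hi! I'm Restrobot. How may I assist you today?"],
}

def offers_handler():
    # In the real app this would query the menu collection for
    # documents whose "offer" field is non-empty.
    return "Today's offer: 10% off on Margherita Pizza."

# Intents whose responses are left empty get a handler instead.
HANDLERS = {"offers": offers_handler}

def respond(intent):
    if RESPONSES.get(intent):
        return random.choice(RESPONSES[intent])  # pick a canned response
    if intent in HANDLERS:
        return HANDLERS[intent]()                # act, then respond
    return "Sorry, I didn't get that."
```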
7. Finally, integrate with Flask
We will be using AJAX for asynchronous transfer of data, i.e., you won’t have to reload your webpage every time you send an input to the model. The web application will respond to your inputs seamlessly. Let’s take a look at the HTML file.
The latest Flask is threaded by default, so even if different users chat at the same time, the booking IDs will be unique across all instances, and common variables like seat_count will be shared.
In the JavaScript section, we get the input from the user, send it to app.py where the response is generated, and then receive the output back to display it in the app.
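A minimal sketch of the Flask side of that exchange; the /get endpoint name and the stub classifier are assumptions, and the real app plugs in the FastText-based detect_intent() instead:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def detect_intent(message):
    # Stub standing in for the real FastText + cosine-similarity classifier.
    return "greeting" if "hi" in message.lower() else "no_answer"

@app.route("/get", methods=["POST"])
def chat():
    # The AJAX call posts the user's message as JSON.
    message = request.get_json().get("message", "")
    intent = detect_intent(message)
    reply = ("Hello I'm Restrobot! How can I help you?"
             if intent == "greeting" else "Sorry, I didn't get that.")
    # The JavaScript side displays "response" in the chat window.
    return jsonify({"intent": intent, "response": reply})
```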
Some Snapshots of this beauty we just built:
Conclusion
And that’s how we built a simple NLP chatbot with a very limited amount of data! It can obviously be improved a lot by handling various corner cases, and made more useful in real life. All the code is open-sourced in my GitHub repo. If you come up with enhancements to this project, feel free to open an issue and contribute. I would love to review and merge your feature enhancements and attend to any issues on GitHub!
At the cost of sounding silly, here’s me requesting you to give this post some claps if you have read this far and liked my effort :”)