Building an NLP Chatbot for a restaurant with Flask

Aindriya Barua (They/She)
9 min read · Nov 20, 2021
Graphic design by author

Want to build a chatbot personalized to a particular business but have very little data, or don’t have time to go through the hassle of creating your business-specific data for tasks like intent classifications and named entity recognition? This blog is a solution to just that!

Imagine a machine that completely understands the diverse ways a human could query something and responds in natural language just as a human would! To me, that feels like almost everything we would ever want to achieve through NLP. Hence, this is one application I have always been intrigued by.

A few weeks back, I finally set out to design my first NLP chatbot! Of course, I have deliberated (with myself, lol) on the nature of this chatbot — and I came to the profound decision (my face was stuffed with food and I was looking for desserts to order online) that my chatbot would serve a restaurant by chatting and assisting patrons.

Functionalities of the Chatbot:

  1. Greet
  2. Show menu
  3. Show offers available
  4. Show just vegetarian options if available
  5. Show vegan options if available
  6. Explain more about any particular Food item, giving details of its preparation and ingredients
  7. Assure customers about the COVID protocols and hygiene followed by the restaurant
  8. Tell the hours the restaurant is open
  9. Check if tables are available
  10. Book a table if available and give the customer a unique booking ID
  11. Suggest what to order
  12. Answer if asked if they are a bot or human
  13. Give contact information of the restaurant
  14. Give the address of the restaurant
  15. Take positive feedback, respond accordingly, and store it for the Restaurant management to check
  16. Take negative feedback, respond accordingly, and store it for the Restaurant management to check
  17. Respond to some general messages
  18. Bid goodbye

Final Outcome:

Please click on the Full Screen button and change the quality from SD to HD to see it clearly.

Flask app demo

Overview:

Creation of embedded_dataset.json:

First, we embed our dataset; the embedded version will be used as input by the chatbot. This is a one-time job.

Overview of the whole architecture:

How to set up and run the project?

This section is just to get the project up and running; I will explain the parts one by one deeper into the blog :)

1. Install Pre-requisites

My python version is 3.6.13.

To install all the required libraries, download/clone my GitHub repo and in the folder, open CMD and enter:

> pip install -r requirements.txt

These are the contents of the requirements.txt file.

numpy
nltk
tensorflow
tflearn
flask
sklearn
pymongo
fasttext
tsne

2. Download pre-trained FastText English model

Download cc.en.300.bin.gz from here. Unzip it to get cc.en.300.bin; a helper script for this is in my GitHub repo.

3. Prepare dataset

Run data_embedder.py. This will take the dataset.json file and convert all the sentences to FastText vectors.

> python data_embedder.py

4. Set up MongoDB on localhost

Install MongoDB Compass

Make 3 collections: menu, bookings, feedback

The menu has to be hardcoded, since it is specific to the restaurant: populate it with the food items the eatery provides, their prices, etc. Each document includes the fields item, cost, vegan, veg, about, and offer. I made a small JSON file with the data and imported it into MongoDB Compass to populate the menu collection. You can find my menu data here.

One example document in menu:
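For illustration, a document with these fields might look like the following (the values here are hypothetical, not taken from my actual menu data):

```json
{
  "item": "Paneer Tikka",
  "cost": 250,
  "vegan": "N",
  "veg": "Y",
  "about": "Chunks of paneer marinated in spices and grilled in a tandoor",
  "offer": "10% off"
}
```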

Documents will be inserted into feedback when a user gives feedback, so that the restaurant authorities can read them and take necessary action.

Example docs in the feedback collection:
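For illustration, feedback documents of the two types might look like this (hypothetical values):

```json
{ "feedback_string": "The food was amazing, loved the service!", "type": "positive" }
{ "feedback_string": "The order took too long to arrive.", "type": "negative" }
```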

The bookings collection stores the unique booking ID and the timestamp of the booking, so that when a customer shows the ID at the reception, the booking can be verified.

5. Run Flask

This will launch the web app on localhost

> export FLASK_APP=app
> export FLASK_ENV=development
> flask run

Implementation:

Our friendly little bot's job has two major parts:

  1. Intent Classification: understand the intent of a message, i.e., what the customer is querying for
  2. Conversation Design: design how the conversation goes, responding to each message as per its intent

For example,

The user sends a message: “Please show me the vegetarian items on the menu?”

  1. The chatbot identifies the intent as “veg_enquiry”
  2. And then the chatbot acts accordingly, that is, it queries the restaurant DB for vegetarian items and communicates them to the user.

Now, let us go through it step by step.

1. Building Dataset

The dataset is a JSON file with three fields: tag, patterns, and responses, where we record a few possible messages for each intent, along with some possible responses. For some of the intents the responses are left empty, because they require further action to determine the response. For example, for the query “Are there any offers going on?”, the bot would first have to check in the database whether any offers are active and then respond accordingly.

The dataset looks like this:

{"intents": [
{"tag": "greeting",
"patterns": ["Hi", "Good morning!", "Hey! Good morning", "Hello there", "Greetings to you"],
"responses": ["Hello I'm Restrobot! How can I help you?", "Hi! I'm Restrobot. How may I assist you today?"]
},
{"tag": "book_table",
"patterns": ["Can I book a table?","I want to book a seat", "Can I book a seat?", "Could you help me book a table", "Can I reserve a seat?", "I need a reservation"],
"responses": [""]
},
{"tag": "goodbye",
"patterns": ["I will leave now","See you later", "Goodbye", "Leaving now, Bye", "Take care"],
"responses": ["It's been my pleasure serving you!", "Hope to see you again soon! Goodbye!"]
},
.
.
.

2. Normalising messages

The first step is to normalize the messages. In natural language, humans may say the same thing in many ways. Normalizing text reduces this randomness, bringing it closer to a predefined “standard”. This reduces the amount of variation the computer has to deal with, and therefore improves efficiency. We take the following steps to normalize all texts, both the messages in our dataset and the messages sent by customers:

  1. Convert all to lower case
  2. Remove punctuation
  3. Remove stopwords: since the dataset is small, using the NLTK stop words stripped it of many words that were important in this context. So I wrote a small script to get the words and their frequencies across the whole dataset, and manually selected the inconsequential ones to build this list
  4. Lemmatization: using a vocabulary and morphological analysis of words to remove inflectional endings and return the base or dictionary form of a word (e.g., “items” → “item”)

3. Sentence Embedding:

We use the FastText pre-trained English model cc.en.300.bin.gz, downloaded from here, and the get_sentence_vector() function provided by the fasttext library. It works as follows: each word in the sentence is converted to a FastText word vector, each vector is divided by its L2 norm, and then the average of only the vectors with a positive L2 norm is taken.
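That averaging scheme can be sketched in plain NumPy; the tiny word_vectors dict below is a toy stand-in for the real 300-dimensional FastText model:

```python
import numpy as np

# Toy stand-in for FastText word vectors (real ones are 300-dimensional).
word_vectors = {
    "show": np.array([1.0, 0.0]),
    "menu": np.array([0.0, 2.0]),
}

def sentence_vector(sentence: str) -> np.ndarray:
    # Divide each word vector by its L2 norm, then average only the
    # vectors with a positive norm -- mirroring get_sentence_vector().
    vecs = []
    for word in sentence.split():
        v = word_vectors.get(word, np.zeros(2))
        norm = np.linalg.norm(v)
        if norm > 0:
            vecs.append(v / norm)
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)
```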

After embedding the sentences in the dataset, I wrote them back into a JSON file called embedded_dataset.json and kept it for later use while running the chatbot.

4. Intent Classification:

Intent classification means understanding the intention of a message, i.e., what the customer is basically querying: given a sentence/message, the bot should be able to box it into one of the pre-defined intents.

Intuition:

In our case, we have 18 intents that demand 18 different kinds of responses.

Now to achieve this with machine learning or deep learning techniques, we would require a lot of sentences, annotated with their corresponding intent tags. However, it was hard for me to generate such a large intent annotated dataset specific to a restaurant’s requirements, with the customized 18 labels. So I came up with my own solution for this.

I made a small dataset, with a few example messages for each of the 18 intents. Intuitively, all these messages, when converted to vectors with a word embedding model (I have used pre-trained FastText English model), and represented on a 2-D space should lie close to each other.

To validate my intuition, I took 6 such groups of sentences and plotted them on a t-SNE graph. I used K-means unsupervised clustering and, as expected, the sentences mapped clearly into 6 distinct groups in the vector space:

The code for the t-SNE visualization of the sentences is here; I will not go into its details in this post.

Implementing intent classification:

Given a message, we need to identify which intent (sentence cluster) it is closest to. We find the closeness with cosine similarity.

Cosine similarity is a metric used to measure how similar documents (sentences/messages) are irrespective of their size. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. It is advantageous because even if two similar documents are far apart in Euclidean distance (due to document size), chances are they may still be oriented close together. The smaller the angle, the higher the cosine similarity.

Logic of finalizing on the intent is explained in comments in the detect_intent() function:
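As a simplified sketch of that logic (the function and variable names here are my own, not necessarily those in the repo), assuming the embedded dataset maps each tag to its list of pattern vectors:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def detect_intent(message_vec: np.ndarray, embedded_dataset: dict) -> str:
    # Compare the embedded message against every embedded pattern and
    # return the tag of the closest one.
    best_tag, best_score = None, -1.0
    for tag, pattern_vecs in embedded_dataset.items():
        for pv in pattern_vecs:
            score = cosine_similarity(message_vec, pv)
            if score > best_score:
                best_tag, best_score = tag, score
    return best_tag
```

A production version would also apply a similarity threshold, falling back to a “sorry, I didn't understand” response when no pattern is close enough.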

5. Database to hold restaurant info

Here we use pymongo to store the restaurant's information. I created three collections:

1. menu has fields: item, cost, vegan, veg, about, offer -> app.py queries it

2. feedback has fields: feedback_string, type -> docs are inserted into it by app.py

3. bookings has fields: booking_id, booking_time -> docs are inserted into it by app.py

6. Generate response and Act as per message

In our dataset.json we have already kept a list of responses for some of the intents; for these, we simply choose a response from the list at random. But for a number of intents we have left the responses empty; in those cases, we have to generate a response or act as per the intent, e.g., by querying information from the database, creating a unique booking ID, or fetching the recipe of an item.
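The two cases can be sketched as follows; the uuid-based booking ID is my own illustrative choice, not necessarily how the repo generates it:

```python
import random
import uuid
from datetime import datetime

def canned_response(intent: dict) -> str:
    # For intents whose "responses" list is non-empty, pick one at random.
    return random.choice(intent["responses"])

def make_booking() -> dict:
    # Hypothetical sketch: a unique booking ID plus a timestamp, ready
    # to be inserted into the bookings collection and shown to the user.
    return {"booking_id": uuid.uuid4().hex[:8].upper(),
            "booking_time": datetime.now().isoformat()}
```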

7. Finally, integrate with Flask

We will be using AJAX for asynchronous transfer of data, i.e., you won't have to reload your webpage every time you send an input to the model. The web application will respond to your inputs seamlessly. Let's take a look at the HTML file.

The latest Flask is threaded by default, so if different users chat at the same time, the booking IDs will be unique across all sessions, and common variables like seat_count will be shared.

In the JavaScript section we get the input from the user, send it to app.py where the response is generated, and then receive the output back to display it in the app.
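The round trip can be sketched as a minimal Flask route; the endpoint name and helper function here are illustrative, not necessarily those used in the repo:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_response(message: str) -> str:
    # Placeholder for the real pipeline:
    # normalize -> embed -> detect_intent -> act / pick a response.
    return f"You said: {message}"

@app.route("/get", methods=["POST"])
def chat():
    # The AJAX call posts the user's message; we reply with JSON that
    # the JavaScript side renders in the chat window.
    message = request.form.get("msg", "")
    return jsonify({"response": generate_response(message)})
```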

Some Snapshots of this beauty we just built:

Conclusion

And that’s how we build a simple NLP chatbot with a very limited amount of data! This can obviously be improved a lot by handling various corner cases and made more useful in real life. All the code is open-sourced on my GitHub repo. If you come up with enhancements to this project, feel free to open an issue and contribute. I would love to review and merge your feature enhancements and attend to any issues on my GitHub!

At the cost of sounding silly, here’s me requesting you to give this post some claps if you have reached till here, and liked my effort :”)
