Chatbot with TensorFlow 2.0 — Going Merry

Ashwin Prasad
Published in Analytics Vidhya
5 min read · Oct 22, 2020

Note: All the code files will be available at https://github.com/ashwinhprasad/Chatbot-GoingMerry

Going Merry is a chatbot that I created for a pirate recruitment process. It helps recruit pirates from all around the world by answering users' simple questions about the recruitment process, prerequisites, and so on. The same model can also be used to create chatbots for any organization.

Introduction

A chatbot is a software application used to conduct an online chat conversation via text. In this blog post, I will show how to create a simple chatbot with TensorFlow 2 for your organization.

Dataset Preparation

Once the dataset is built, half the work is already done. How we structure the dataset is the main thing in a chatbot. I have used a JSON file to create the dataset.

JSON files work much like dictionaries in Python: you store data in them as key-value pairs, just as you would in a Python dictionary.

input: These are the messages that the user is expected to send to the bot.

tags: Tags are used to categorise the inputs and map them to a particular type of response.

responses: Once we have mapped an input to an appropriate tag, we can select one of that tag's responses to give back to the user.

This is basically how the dataset is structured for the chatbot.
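To make the structure concrete, here is a minimal sketch of what such a file could contain. The key names (intents, tag, input, responses) are the ones the loading code in this post reads; the "greeting" tag and the pairing of sentences with tags are hypothetical and only for illustration.

```python
import json

# Hypothetical sample in the shape the loading code expects:
# a top-level "intents" list whose items have "tag", "input" and "responses".
sample_json = """
{
  "intents": [
    {
      "tag": "greeting",
      "input": ["hi there", "hello"],
      "responses": ["welcome aboard, how may I help you ?"]
    },
    {
      "tag": "goodbye",
      "input": ["thanks for the response", "bye"],
      "responses": ["Okay, Bye"]
    }
  ]
}
"""

sample = json.loads(sample_json)

# Flatten into (input sentence, tag) pairs, one pair per example sentence.
pairs = [(text, intent["tag"])
         for intent in sample["intents"]
         for text in intent["input"]]

# Keep the candidate responses for every tag.
sample_responses = {intent["tag"]: intent["responses"]
                    for intent in sample["intents"]}

print(pairs)
print(sample_responses["goodbye"])
```

Each example sentence ends up paired with exactly one tag, and each tag owns a list of possible replies.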

Machine Learning Part

1. Importing the Libraries

#importing the libraries
import tensorflow as tf
import numpy as np
import pandas as pd
import json
import nltk
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, GlobalMaxPooling1D, Flatten
from tensorflow.keras.models import Model
import matplotlib.pyplot as plt

2. Importing the Data

#importing the dataset
with open('content.json') as content:
    data1 = json.load(content)

#getting all the data to lists
tags = []
inputs = []
responses = {}
for intent in data1['intents']:
    #all possible responses for this tag
    responses[intent['tag']] = intent['responses']
    #one (input, tag) pair per example sentence
    for lines in intent['input']:
        inputs.append(lines)
        tags.append(intent['tag'])

#converting to dataframe
data = pd.DataFrame({"inputs": inputs,
                     "tags": tags})
print(data)

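output: a dataframe with one row per example sentence, holding the sentence in the inputs column and its tag in the tags column.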

The data is stored in a JSON file, which can be imported and used as a pandas dataframe. I created this data manually, so it is not very big.
Deep learning usually requires large amounts of data, but that is not the case here. I have used a compact neural network architecture that can learn from this small amount of data.

3. Pre-Processing the data

#removing punctuation and converting to lowercase
import string
data['inputs'] = data['inputs'].apply(lambda wrd: [ltrs.lower() for ltrs in wrd if ltrs not in string.punctuation])
data['inputs'] = data['inputs'].apply(lambda wrd: ''.join(wrd))

#tokenize the data
from tensorflow.keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer(num_words=2000)
tokenizer.fit_on_texts(data['inputs'])
train = tokenizer.texts_to_sequences(data['inputs'])

#apply padding
from tensorflow.keras.preprocessing.sequence import pad_sequences
x_train = pad_sequences(train)

#encoding the outputs
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y_train = le.fit_transform(data['tags'])

TensorFlow's tokenizer assigns a unique integer token to each distinct word, and padding brings all the sequences to the same length so that they can be sent to an RNN layer. The target tags are also encoded as integer labels.
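The three steps can be illustrated without any library. The sketch below imitates the behaviour in plain Python, assuming words are numbered from 1 by descending frequency, zeros are added at the front of short sequences, and tags are numbered in sorted order. The helper names and sample sentences are made up for this illustration; it is not the Keras or scikit-learn implementation.

```python
from collections import Counter

def build_word_index(texts):
    """Number each distinct word from 1, most frequent word first."""
    counts = Counter(word for text in texts for word in text.split())
    ordered = sorted(counts, key=lambda word: -counts[word])
    return {word: idx for idx, word in enumerate(ordered, start=1)}

def to_sequences(texts, word_index):
    """Replace every known word with its number; unknown words are dropped."""
    return [[word_index[w] for w in text.split() if w in word_index]
            for text in texts]

def pad_front(sequences):
    """Left-pad with zeros so all sequences share the longest length."""
    length = max(len(seq) for seq in sequences)
    return [[0] * (length - len(seq)) + seq for seq in sequences]

def encode_labels(tags):
    """Map each tag to its position in the sorted list of distinct tags."""
    classes = sorted(set(tags))
    lookup = {tag: idx for idx, tag in enumerate(classes)}
    return classes, [lookup[tag] for tag in tags]

# Hypothetical mini dataset
texts = ["hi there", "how do i join", "hi"]
labels = ["greeting", "join", "greeting"]

word_index = build_word_index(texts)
padded = pad_front(to_sequences(texts, word_index))
classes, y = encode_labels(labels)

print(padded)   # [[0, 0, 1, 2], [3, 4, 5, 6], [0, 0, 0, 1]]
print(y)        # [0, 1, 0]
```

Because 0 is used only for padding, real words never get that number, which is why the embedding layer later needs one more row than there are words.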

4. Input Length, Output Length and Vocabulary

#input length
input_shape = x_train.shape[1]
print(input_shape)
#define vocabulary
vocabulary = len(tokenizer.word_index)
print("number of unique words : ",vocabulary)
#output length
output_length = le.classes_.shape[0]
print("output length: ",output_length)
output:

number of unique words : 96
output length: 8

The input length and output length define the input shape and output shape of the neural network. The vocabulary size tells the embedding layer how many words need their own vector representation.

5. Neural Network

#creating the model
i = Input(shape=(input_shape,))                    #padded token ids
x = Embedding(vocabulary+1, 10)(i)                 #+1 because index 0 is used for padding
x = LSTM(10, return_sequences=True)(x)             #one 10-dimensional output per time step
x = Flatten()(x)
x = Dense(output_length, activation="softmax")(x)  #one probability per tag
model = Model(i, x)

#compiling the model
model.compile(loss="sparse_categorical_crossentropy", optimizer='adam', metrics=['accuracy'])

#training the model
train = model.fit(x_train, y_train, epochs=200)

The network consists of an embedding layer, which is one of the most powerful tools in natural language processing. The output of the embedding layer is the input of the recurrent layer, an LSTM. Its output is then flattened, and a regular dense layer with a softmax activation function produces one probability per tag.
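To get a feel for how small this network is, the trainable parameters of each layer can be worked out by hand. The sketch below uses the standard parameter formulas for an embedding, an LSTM with bias, and a dense layer. The vocabulary (96) and number of tags (8) come from the output printed earlier; the input length of 9 is an assumed value used only for the arithmetic, and count_parameters is a helper invented for this example.

```python
def count_parameters(input_len, vocabulary, embedding_dim, lstm_units, n_tags):
    # One vector per token id, including the padding id 0.
    embedding = (vocabulary + 1) * embedding_dim
    # An LSTM has 4 weight blocks, each with input weights,
    # recurrent weights and a bias.
    lstm = 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)
    # Flatten turns (input_len, lstm_units) into input_len * lstm_units values,
    # and the dense layer connects all of them to every tag, plus a bias per tag.
    dense = input_len * lstm_units * n_tags + n_tags
    return {"embedding": embedding, "lstm": lstm, "dense": dense,
            "total": embedding + lstm + dense}

# input_len=9 is a hypothetical sequence length
counts = count_parameters(input_len=9, vocabulary=96,
                          embedding_dim=10, lstm_units=10, n_tags=8)
print(counts)
# {'embedding': 970, 'lstm': 840, 'dense': 728, 'total': 2538}
```

Under these assumptions the whole model has only a few thousand weights, which fits the small, hand-written dataset.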

The main part is the embedding layer, which holds a corresponding vector for each word in the dataset.
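Conceptually, an embedding layer is a lookup table with one row per token id. The following plain-Python sketch shows only that lookup; the random starting values and the sample ids are assumptions for illustration, and in the real layer the rows are adjusted during training.

```python
import random

vocabulary = 96      # unique words, as printed earlier in this post
embedding_dim = 10   # same vector size as Embedding(vocabulary+1, 10)

rng = random.Random(0)
# One row per token id; row 0 belongs to the padding id 0.
embedding_table = [[rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
                   for _ in range(vocabulary + 1)]

def embed(token_ids):
    """Replace every token id with its vector (a plain table lookup)."""
    return [embedding_table[token_id] for token_id in token_ids]

padded_sentence = [0, 0, 1, 2]   # hypothetical padded token ids
vectors = embed(padded_sentence)
print(len(vectors), len(vectors[0]))   # 4 10
```

The same word always maps to the same row, so the LSTM receives a sequence of vectors instead of a sequence of bare integers.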

Figure: Model accuracy

6. Model Analysis

The model reached a perfect accuracy of 100% on the training data; no separate validation set was used.

7. Testing

#chatting
import random
while True:
    texts_p = []
    prediction_input = input('You : ')

    #removing punctuation and converting to lowercase
    prediction_input = [letters.lower() for letters in prediction_input if letters not in string.punctuation]
    prediction_input = ''.join(prediction_input)
    texts_p.append(prediction_input)

    #tokenizing and padding
    prediction_input = tokenizer.texts_to_sequences(texts_p)
    prediction_input = np.array(prediction_input).reshape(-1)
    prediction_input = pad_sequences([prediction_input], input_shape)

    #getting output from model
    output = model.predict(prediction_input)
    output = output.argmax()

    #finding the right tag and predicting
    response_tag = le.inverse_transform([output])[0]
    print("Going Merry : ", random.choice(responses[response_tag]))

    #stop chatting once the user says goodbye
    if response_tag == "goodbye":
        break

output:

You : hi there
Going Merry : welcome aboard, how may I help you ?
You : how do i join the pirates ?
Going Merry : you need to contact the nearest straw hats center once you have completed the pre-requisites
You : what are the pre requisites ?
Going Merry : you gotta master the art of swordfight or master Cooking and become a furious cook or you have to be the best navigator in the entire world or you need to be as strong and intelligent as god usop
You : alright , who are you ?
Going Merry : Going Merry at your service
You : how are you doing ?
Going Merry : Yeah , am fine
You : nice , thanks for the response
Going Merry : Okay, Bye

That’s it for creating a chatbot.

About Going Merry

Going Merry is a ship from a manga called “One Piece”, which revolves around the story of a bunch of pirates. Going Merry is, without a doubt, an amazing ship, but it passed away.
To know more about Going Merry: https://onepiece.fandom.com/wiki/Going_Merry

Conclusion

So, this is the chatbot that I have created with TensorFlow 2, utilizing the power of an embedding matrix. The same method can be used to build chatbots for any type of organization, but not a general-purpose chatbot.


Ashwin Prasad
Analytics Vidhya

I write about things that intrigue me in any field of Computer Science, with more weight given to Machine Learning and Systems Programming.