Building a Simple Chatbot using Deep Learning

Fatih İnci
6 min read · May 20, 2023

In this tutorial, we will walk through the process of building a simple chatbot using deep learning techniques. Chatbots are computer programs that interact with users in natural language, and they have become increasingly popular for various applications such as customer support, information retrieval, and personal assistants. Our chatbot will be able to understand user input and generate appropriate responses based on the trained model.

Prerequisites

To follow along with this tutorial, you will need the following libraries installed:

  • NumPy: A library for numerical computing in Python.
  • TensorFlow: An open-source deep learning framework.
  • scikit-learn: A library for machine learning tasks.

You can install these libraries using pip by running the following command:

$ pip install numpy tensorflow scikit-learn

Step 1: Preparing the Training Data

The first step in building a chatbot is to prepare the training data. We need two sets of data: the input data (user messages or queries) and the corresponding output data (bot responses).

Here’s an example of the training data we will use:

train_data = [
    "Hello",
    "How are you?",
    "Good morning",
    "Good evening",
    "Nice to meet you",
    "What's up?",
    "How's your day going?",
    "Greetings!",
    "Good afternoon",
    "How can I assist you?",
    "Pleasure to see you",
    "Is there anything I can help with?"
]

train_labels = [
    "Hi",
    "I'm fine, how about you?",
    "Good morning to you",
    "Good evening, how can I help you?",
    "Nice to meet you too",
    "Not much, just hanging out",
    "It's going well, thank you",
    "Hello!",
    "Good afternoon to you too",
    "I'm here to assist you",
    "Likewise!",
    "Yes, I have a question"
]

The train_data list contains the user messages, and the train_labels list contains the corresponding bot responses.
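Since the model will learn a one-to-one mapping, each entry in train_data must line up with the response at the same index in train_labels. A quick sanity check (a small addition, not in the original code) catches mismatches early:

assert len(train_data) == len(train_labels), "each message needs exactly one response"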

Step 2: Data Preprocessing

Before we can train our chatbot model, we need to preprocess the training data. In this step, we will perform the following tasks:

1. Encode the labels: We will use the LabelEncoder from scikit-learn to convert the text labels into numerical values. This step is necessary because deep learning models require numerical inputs.

2. Tokenize and pad the text data: We will use the Tokenizer from TensorFlow to tokenize the text data into sequences of integers. This step converts each word into a unique integer value. We will also pad the sequences to ensure they have the same length, which is necessary for training the model.

Here’s the code to preprocess the data:

from sklearn.preprocessing import LabelEncoder
from tensorflow import keras

# Convert each text label into a unique integer (0 .. n_classes - 1)
label_encoder = LabelEncoder()
encoded_labels = label_encoder.fit_transform(train_labels)

# Map each word to an integer and turn every message into a sequence of those integers
tokenizer = keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(train_data)
train_sequences = tokenizer.texts_to_sequences(train_data)

# Pad the sequences with zeros so they all share the length of the longest message
train_sequences = keras.preprocessing.sequence.pad_sequences(train_sequences)

The LabelEncoder's fit_transform method learns the unique labels, assigns each one a numerical value, and returns the encoded labels in a single step.
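If you're curious what the encoder produced, you can inspect it directly (a quick check, not part of the original pipeline):

print(label_encoder.classes_)  # the unique response strings, sorted alphabetically
print(encoded_labels)          # one integer per training example, indexing into classes_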

The Tokenizer is fitted on the train_data to learn the unique words and assign them integer values. The texts_to_sequences method then converts the text data into sequences of integers based on the learned mapping. Finally, the pad_sequences method ensures all sequences have the same length; by default it pads shorter sequences with zeros on the left, up to the length of the longest one.
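Similarly, you can peek at what the Tokenizer learned and what the padded sequences look like (the values noted in the comments are illustrative; exact indices depend on word frequencies):

print(tokenizer.word_index)    # e.g. {'you': 1, 'good': 2, ...}; frequent words get low indices
print(train_sequences.shape)   # (number of messages, length of the longest message)
print(train_sequences[0])      # "Hello" becomes one word index, left-padded with zeros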

Step 3: Building and Training the Chatbot Model

With the training data prepared, we can now build our chatbot model. In this example, we will use an Embedding layer to handle the text data. The Embedding layer maps the input sequence of integers to a dense vector representation. This layer is often used in natural language processing tasks to capture the semantic meaning of words. Here's the code:

# Sequential groups a linear stack of layers into a tf.keras.Model.
model = keras.models.Sequential()

# Map each word index to a dense 100-dimensional vector
model.add(keras.layers.Embedding(len(tokenizer.word_index) + 1, 100, input_length=train_sequences.shape[1]))
# Collapse (sequence_length, 100) into a single flat vector
model.add(keras.layers.Flatten())
# Hidden layer that learns non-linear combinations of the embedded words
model.add(keras.layers.Dense(64, activation='relu'))
# One output unit per unique response, with softmax probabilities
model.add(keras.layers.Dense(len(label_encoder.classes_), activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_sequences, encoded_labels, epochs=50)

The Embedding layer is given three arguments here: the vocabulary size, which is the total number of unique words in our training data plus one for the padding index (len(tokenizer.word_index) + 1); the embedding dimension (100 in this case), which is the size of the dense vector learned for each word; and input_length, the length of the padded input sequences.

The Flatten layer is added to convert the multi-dimensional output of the Embedding layer into a one-dimensional vector, which can be fed into the subsequent dense layers.
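To make these shapes concrete, you can push a single padded sequence through the first two layers and print the results (a small sketch; run it any time after the model has been built):

sample = train_sequences[:1]            # a batch containing one padded sequence
embedded = model.layers[0](sample)      # Embedding output: (1, sequence_length, 100)
flattened = model.layers[1](embedded)   # Flatten output: (1, sequence_length * 100)
print(embedded.shape, flattened.shape)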

The following dense layer with ReLU (Rectified Linear Unit) activation introduces non-linearity to the model, allowing it to learn complex patterns in the data. Finally, the output layer uses the softmax activation function to produce a probability score for each response class.

We compile the model using the Adam optimizer and the sparse categorical crossentropy loss function, which is suitable for multi-class classification with integer labels. During training, the model adjusts its internal parameters to minimize this loss; accuracy is tracked as a metric so we can watch the progress.

We train the model using the fit method, specifying the input sequences (train_sequences) and the corresponding encoded labels (encoded_labels). We set the number of epochs to 50, indicating the number of times the model will iterate over the entire training dataset.
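fit also returns a History object, so if you want to confirm that training is converging you can capture it and check the final metrics (a variant of the call above; note that calling fit again continues training the same model):

history = model.fit(train_sequences, encoded_labels, epochs=50, verbose=0)
print("final loss:", history.history["loss"][-1])
print("final accuracy:", history.history["accuracy"][-1])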

What is an epoch?

An epoch is one complete pass of the learning algorithm through the entire training dataset, and the number of epochs is a hyperparameter you choose. With our 12 training samples and Keras's default batch size of 32, each epoch is a single gradient update, so 50 epochs means 50 updates.

Step 4: Generating Responses

Now that our chatbot model is trained, we can use it to generate responses based on user input. We define a function called generate_response that takes a text input, tokenizes it, pads the sequence, and makes predictions using the trained model. The predicted label is then converted back into its original text form using the inverse_transform method of the LabelEncoder. Here's the code:

import numpy as np

def generate_response(text):
    # Convert the input text to a padded integer sequence, matching the training format
    sequence = tokenizer.texts_to_sequences([text])
    sequence = keras.preprocessing.sequence.pad_sequences(sequence, maxlen=train_sequences.shape[1])
    # Pick the class with the highest predicted probability
    prediction = model.predict(sequence)
    predicted_label = np.argmax(prediction)
    # Map the class index back to the original response text
    response = label_encoder.inverse_transform([predicted_label])[0]
    return response
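A quick single call shows the function in action. With a dataset this small the model typically memorizes the training pairs, but the exact response still depends on how training went:

print(generate_response("Hello"))  # expected: "Hi", assuming training converged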

To interact with the chatbot, we can use a loop that continuously prompts the user for input and generates responses using the generate_response function. Here's an example:

while True:
    user_input = input("Enter a message: ")
    if user_input.lower() in ("quit", "exit"):
        break
    response = generate_response(user_input)
    print("ChatBot:", response)

This loop keeps running until the user types quit or exit.

Conclusion

In this tutorial, we have built a simple chatbot using deep learning techniques. We learned how to preprocess the training data, build an Embedding layer-based model, and generate responses based on user input. You can further enhance the chatbot by adding more training data, experimenting with different architectures, and exploring advanced techniques such as attention mechanisms or transformer models.

Remember, building a sophisticated chatbot often requires a larger dataset, more complex models, and extensive fine-tuning. However, this tutorial serves as a starting point for creating your own chatbot and understanding the basic concepts involved.

Feel free to experiment and adapt the code to your specific use case. Happy coding!

GitHub repo: https://github.com/fatih255/building-a-simple-chatbot-with-python

This was an experimental project, so please do not hesitate to point out anything I got wrong. :)
