How to create a chatbot in 3 easy steps? (T5 Transformer)

Published in Analytics Vidhya, Aug 13, 2021 (3 min read)

A 3-step tutorial on using my library to create a contextual chatbot (no deep learning required, because I did it for you) and deploy it on Reddit, Telegram, or mobile applications.

Use Case

Ever wanted to make a chatbot?

We made it easy for you: using subreddits, your chatbot can learn how people talk and respond to certain questions and replies.

And what if I told you that you can build it with almost no coding and then deploy it anywhere you want?

Here is a chatbot I made using my library, trained on data from the r/mentalhealth subreddit.

Demo:

The training data is gathered from reddit.com/r/mentalhealth

Chatformer is an end-to-end implementation of a chatbot using a powerful Transformer model called T5.

Here I will show you how to create, update, train, and analyze the chatbot, and how to deploy it anywhere, such as Telegram or Reddit, or use it on your own website.

Can’t wait already?
Open the following Google Colab notebook to use it on the go.

Link to colab notebook

Training Data

Don’t have training data?
Have a subreddit in mind that fits your use case?
We’ve got you covered!

For the following example, we will use Reddit to get the training data.

STEP 1:

Clone

Clone this repo to your local machine using

$ git clone https://github.com/Ar9av/transformer-nmt-chatbot.git

Change the working directory

$ cd transformer-nmt-chatbot

Setup

Install the requirements using the following commands. We might need the Chromium driver to create our training data.

$ pip install -r requirements.txt
$ sudo apt-get install chromium-chromedriver

STEP 2:

Training Data based on Reddit’s conversation threads

You can train directly on Reddit conversations just by providing the subreddits and the number of pages of data you want.

To enable this, set reddit_data to True in config.yml.

Specify the subreddits, pages, and sorting criteria in reddit_config.yml.

Example template for reddit_config.yml

Training Data based on custom dataset

If you’re using your own custom dataset, keep it in the following format.

Change the parameter reddit_data in config.yml to False.

The training data should be inside the data folder.

The conversation data should be kept in two files, train.to and train.from. The line number acts as the id of each 1-1 conversation: line N of train.from and line N of train.to together form one exchange, so both files must have the same number of lines.

File train.to:

Hey
How are you

File train.from:

Hi
I am fine
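
If you are building these two files from your own conversation logs, a small helper can keep them aligned. The following is only a minimal sketch, not part of the library: the function names write_pairs and read_pairs are made up for illustration, and the only things taken from the article are the file names train.from and train.to and the rule that matching line numbers form a pair.

```python
from pathlib import Path


def write_pairs(data_dir, from_lines, to_lines):
    """Write two line-aligned files, train.from and train.to, into data_dir.

    Hypothetical helper: only the file names come from the article.
    """
    if len(from_lines) != len(to_lines):
        raise ValueError("train.from and train.to need the same number of lines")
    for line in list(from_lines) + list(to_lines):
        # A line break inside an utterance would shift every later pair.
        if "\n" in line:
            raise ValueError("each utterance must fit on a single line")
    folder = Path(data_dir)
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "train.from").write_text(
        "".join(line + "\n" for line in from_lines), encoding="utf-8"
    )
    (folder / "train.to").write_text(
        "".join(line + "\n" for line in to_lines), encoding="utf-8"
    )


def read_pairs(data_dir):
    """Return (from_line, to_line) tuples, pairing the two files by line number."""
    folder = Path(data_dir)
    from_lines = (folder / "train.from").read_text(encoding="utf-8").splitlines()
    to_lines = (folder / "train.to").read_text(encoding="utf-8").splitlines()
    if len(from_lines) != len(to_lines):
        raise ValueError("line counts differ, so the pairs are misaligned")
    return list(zip(from_lines, to_lines))
```

Calling write_pairs("data", ["Hi", "I am fine"], ["Hey", "How are you"]) would reproduce the two example files above inside the data folder, and read_pairs("data") is a quick way to check the alignment before training.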

STEP 3:

Sit back. All you have to do is run the following.

Training

Configuring the Training Parameters

Change the parameters in config.yml, set the type to train, and run the following command.

$ python main.py
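
If you would rather flip these switches from a script than edit config.yml by hand, something like the sketch below could work. It is an assumption-heavy illustration, not the library's own tooling: the helper names are invented, it uses only the standard library, and it assumes that reddit_data and type are plain top-level `key: value` lines in config.yml (the article names the keys but does not show the file layout). Any inline comment on a replaced line is dropped.

```python
import re


def set_top_level_key(yaml_text, key, value):
    """Replace the value of a top-level `key: value` line, or append it if absent.

    Assumes the key is a simple top-level scalar; indented (nested) keys are ignored.
    """
    pattern = re.compile(rf"^{re.escape(key)}\s*:.*$", flags=re.MULTILINE)
    new_line = f"{key}: {value}"
    if pattern.search(yaml_text):
        # A function replacement avoids backslash handling in the new text.
        return pattern.sub(lambda _match: new_line, yaml_text, count=1)
    separator = "" if yaml_text.endswith("\n") or not yaml_text else "\n"
    return f"{yaml_text}{separator}{new_line}\n"


def update_config_file(path, updates):
    """Apply several key updates to a config file in place (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    for key, value in updates.items():
        text = set_top_level_key(text, key, value)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
```

For example, update_config_file("config.yml", {"reddit_data": True, "type": "train"}) followed by python main.py would mirror the manual steps described here, provided the file really uses that flat layout.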

IGNORE IF NOT NERD

For you nerds out there who want to play with the training parameters, configure the following:

Want to deploy the app to a subreddit?
Reddit Auto Reply Scripting

After training on a subreddit’s data, we can use the model to run inference over the comments and generate replies using reddit_bot.py.

Configure the bot (app) and user credentials in reddit_credentials.yml.
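
To give a feel for what an auto-reply loop has to take care of, here is a rough sketch. It is not the code in reddit_bot.py: the function name, the comment dictionary shape, and the two callables are all hypothetical stand-ins (generate_reply for the trained model, send_reply for whatever Reddit client you use). The point is only the bookkeeping: skip comments you have already handled, skip the bot's own comments, and never post an empty reply.

```python
def reply_to_comments(comments, generate_reply, send_reply, seen_ids, bot_name):
    """Reply once to each new comment that was not written by the bot itself.

    comments: iterable of dicts with "id", "author", and "body" keys (assumed shape).
    generate_reply: callable mapping comment text to reply text (stands in for the model).
    send_reply: callable(comment_id, text) that posts the reply (stands in for the client).
    seen_ids: set of comment ids already handled; it is updated in place.
    Returns the number of replies sent.
    """
    sent = 0
    for comment in comments:
        if comment["id"] in seen_ids or comment["author"] == bot_name:
            continue
        text = generate_reply(comment["body"]).strip()
        # Mark the comment as handled even if the model produced nothing usable.
        seen_ids.add(comment["id"])
        if text:
            send_reply(comment["id"], text)
            sent += 1
    return sent
```

Passing the model and the client in as plain callables keeps the loop easy to try out with fake data before pointing it at a real subreddit.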

STEP 4:

Relax, you are done, sir.

Here, have an Indian chai and talk to your new chatbot.

Suggestions!?

Why not contribute!?
- Option 1
🍴 Fork this repo!
🔃 Create a new pull request using https://github.com/Ar9av/transformer-nmt-chatbot/compare/.
- Option 2
👯 Clone this repo to your local machine using
$ git clone https://github.com/Ar9av/transformer-nmt-chatbot.git

Have doubts or suggestions? Reach out to me.

My LinkedIn