A Quick Guide

How to create a Chatbot in Python

Developing NLP based chatbot using efficient Transformer PART-1

SARVESH KUMAR SHARMA

Published in Analytics Vidhya

9 min read · Mar 12, 2021

INTRODUCTION

A chatbot is an artificial agent that converses with human beings or with other bots. The conversation can be text-based, verbal, or non-verbal. Chatbots can be accessed through desktops, mobile phones, or other peripheral devices, and they appear everywhere from older HTML-based websites to modern e-commerce and food-ordering sites.

How Do Chatbots Work?

Chatbots use natural language processing (NLP) and machine learning (ML) algorithms to learn from data. NLP is the computer’s ability to understand and process human language and to respond in a language that humans understand, which makes the interaction feel like communication between two humans. In chatbots, NLP involves two processes:

Natural Language Understanding (NLU) — This allows the bot to comprehend a human, converting text into structured data for a machine to understand.

Natural Language Generation (NLG) — It transforms structured data into text, making it possible for the human to understand the conversation.
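
To make the two steps concrete, here is a deliberately tiny, rule-based sketch of the NLU and NLG round trip. It is only an illustration of "text to structured data" and "structured data to text"; it is not how the Reformer model in this article works, and the function names `understand` and `generate`, the intent label, and the slot names are hypothetical.

```python
import re

def understand(text):
    """Toy NLU: turn free text into structured data (an intent plus slots)."""
    frame = {"intent": None, "slots": {}}
    lowered = text.lower()
    if "taxi" in lowered:
        frame["intent"] = "book_taxi"
    # capture a "from X to Y" pattern if the sentence contains one
    match = re.search(r"from (.+?) to (.+?)[.?!]?$", lowered)
    if match:
        frame["slots"]["departure"] = match.group(1)
        frame["slots"]["destination"] = match.group(2)
    return frame

def generate(frame):
    """Toy NLG: turn structured data back into a sentence."""
    slots = frame["slots"]
    if frame["intent"] == "book_taxi":
        if "departure" in slots and "destination" in slots:
            return (f"Booking a taxi from {slots['departure']} "
                    f"to {slots['destination']}. What time do you want to leave?")
        return "Where should the taxi pick you up, and where are you going?"
    return "Sorry, I did not understand that."

frame = understand("I would like a taxi from Saint John's college to Pizza Hut Fen Ditton.")
reply = generate(frame)
print(frame)
print(reply)
```

A learned model replaces both hand-written functions with behaviour acquired from example conversations, which is what the rest of this series builds towards.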

ADVANTAGES OF CHATBOTS

The main advantages of chatbots for customers are as follows:

24×7 support: You cannot rely on support agents around the clock. Responding at any time matters, even after business hours when the team is not available. Chatbots can handle such customer queries with automated responses.
Instant answers: Customers do not like to wait for assistance; any wait time can lead to frustration and potential churn. Chatbots are a smarter way to ensure that customers receive the instant response they expect.
Order without human help: Businesses can use chatbots to automate orders and appointment bookings so that customers can book instantly from the website or Facebook page. 47% of consumers would buy items from a chatbot.

In this article, we are going to create a chatbot using the Reformer, also known as the efficient Transformer, to generate a dialogue between two bots. We will feed conversations to the model and it will learn to understand the context of each one. Not only will it learn how to answer questions, it will also learn how to ask questions when it needs more information. For example, after a customer asks for a train ticket, the chatbot can ask what time the customer wants to leave. This concept can be used to automate call centers, hotel receptions, personal trainers, or any type of customer service. Throughout, we will:

Understand how the efficient Transformer works.
Explore the MultiWoz dataset.
Process the data to feed it into the model.
Train your model.
Generate a dialogue by feeding a question to the model.

Outline

Exploring the MultiWoz dataset
Processing the data for Reformer inputs
Tokenizing, batching with bucketing
Reversible layers
Reversible layers and randomness
ReformerLM Training
Decode from a pretrained model

1: Exploring the MultiWoz dataset

We will start by exploring the MultiWoz dataset. The dataset we are about to use has more than 10,000 human-annotated dialogues and spans multiple domains and topics. Some dialogues cover multiple domains and others cover a single domain. We will load and explore this dataset, and develop a function to extract the dialogues.

Let’s first import the modules we will be using:

import json
import random
import numpy as np
from termcolor import colored
import trax
from trax import layers as tl
from trax.supervised import training
!pip list | grep trax

Let’s also declare some constants we will be using in the exercises.

# filename of the MultiWOZ dialogue dataset
DATA_FILE = 'data.json'

# data directory
DATA_DIR = './data'

# dictionary where we will load the dialogue dataset
DIALOGUE_DB = {}

# vocabulary filename
VOCAB_FILE = 'en_32k.subword'

# vocabulary file directory
VOCAB_DIR = 'data/vocabs'

Let’s now load the MultiWOZ 2.1 dataset, which is stored in JSON format.

# helper function to load a JSON file
def load_json(directory, file):
    # use a separate name for the file handle so it does not shadow the 'file' argument
    with open(f'{directory}/{file}') as json_file:
        db = json.load(json_file)
    return db

# load the dialogue data set into our dictionary
DIALOGUE_DB = load_json(DATA_DIR, DATA_FILE)
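
If the data file is missing, the loader above fails with a fairly bare error. The following is a minimal sketch of a more defensive variant, assuming you want an explicit message and UTF-8 decoding; the name `load_json_checked` is hypothetical, and the demo writes a throwaway file so that it runs without the real dataset.

```python
import json
import tempfile
from pathlib import Path

def load_json_checked(directory, filename):
    """Load a JSON file, failing early with a clear message if it is missing."""
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Expected dataset file at {path}")
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)

# self-contained demo on a temporary file shaped like the dialogue database
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = {"SNG0073.json": {"goal": {}, "log": []}}
    (Path(tmp_dir) / "data.json").write_text(json.dumps(sample), encoding="utf-8")
    loaded = load_json_checked(tmp_dir, "data.json")

print(list(loaded.keys()))
```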

Let’s see how many dialogues we have in the dictionary. Each key-value pair is one dialogue, so we can simply take the dictionary’s length.

print(f'The number of dialogues is: {len(DIALOGUE_DB)}')
#Output----------------
#The number of dialogues is: 10438

The dialogues come from multiple files, and the filenames are used as keys in our dictionary. Multi-domain dialogues have “MUL” in their filenames, while single-domain dialogues have either “SNG” or “WOZ”.

# print 7 keys from the dataset to see the filenames
print(list(DIALOGUE_DB.keys())[0:7])
#Output----------------
#['SNG01856.json', 'SNG0129.json', 'PMUL1635.json', 'MUL2168.json', 'SNG0073.json', 'SNG01445.json', 'MUL2105.json']
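
The filename convention described above can be used to count how many dialogues of each kind there are. This is a small sketch run on the seven sample keys only; the helper name `dialogue_kind` is hypothetical, and with the real data you would pass `DIALOGUE_DB.keys()` instead of `sample_keys`.

```python
from collections import Counter

# the seven sample keys printed above
sample_keys = ['SNG01856.json', 'SNG0129.json', 'PMUL1635.json', 'MUL2168.json',
               'SNG0073.json', 'SNG01445.json', 'MUL2105.json']

def dialogue_kind(filename):
    """Label a dialogue as multi-domain if 'MUL' appears in its filename."""
    return 'multi-domain' if 'MUL' in filename else 'single-domain'

kind_counts = Counter(dialogue_kind(name) for name in sample_keys)
print(kind_counts)
```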

There are 10,438 conversations, each originally in its own file. We will train the model on all of them. Each file is loaded as a dictionary with the following two keys:

# get keys of the fifth file in the list above
print(DIALOGUE_DB['SNG0073.json'].keys())
#Output-----------
#dict_keys(['goal', 'log'])

The goal key points to a dictionary that contains several keys describing the objectives of the conversation. In the example below, we can see that the conversation is about booking a taxi.

DIALOGUE_DB['SNG0073.json']['goal']
#Output----------------
{'taxi': {'info': {'leaveAt': '17:15',
   'destination': 'pizza hut fen ditton',
   'departure': "saint john's college"},
  'reqt': ['car type', 'phone'],
  'fail_info': {}},
 'police': {},
 'hospital': {},
 'hotel': {},
 'attraction': {},
 'train': {},
 'message': ["You want to book a <span class='emphasis'>taxi</span>. The taxi should go to <span class='emphasis'>pizza hut fen ditton</span> and should depart from <span class='emphasis'>saint john's college</span>",
  "The taxi should <span class='emphasis'>leave after 17:15</span>",
  "Make sure you get <span class='emphasis'>car type</span> and <span class='emphasis'>contact number</span>"],
 'restaurant': {}}
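
In the goal above, only the taxi entry is non-empty, which is how we can tell what the conversation is about. The sketch below applies that observation programmatically; `active_domains` is a hypothetical helper, and the goal is a shortened copy of the output above (the 'message' list is abbreviated).

```python
# shortened copy of the goal shown above; the 'message' text is abbreviated
goal = {
    'taxi': {'info': {'leaveAt': '17:15',
                      'destination': 'pizza hut fen ditton',
                      'departure': "saint john's college"},
             'reqt': ['car type', 'phone'],
             'fail_info': {}},
    'police': {}, 'hospital': {}, 'hotel': {},
    'attraction': {}, 'train': {}, 'restaurant': {},
    'message': ['(abbreviated)'],
}

def active_domains(goal):
    """Return the keys whose value is a non-empty dictionary.

    The isinstance check skips 'message', which is a list rather than a domain.
    """
    return [name for name, spec in goal.items() if isinstance(spec, dict) and spec]

domains = active_domains(goal)
print(domains)
```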

The log on the other hand contains the dialog. It is a list of dictionaries and each element of this list contains several descriptions as well. Let's look at an example:

# get first element of the log list
DIALOGUE_DB['SNG0073.json']['log'][0]

output

{'text': "I would like a taxi from Saint John's college to Pizza Hut Fen Ditton.",
 'metadata': {},
 'dialog_act': {'Taxi-Inform': [['Dest', 'pizza hut fen ditton'],
   ['Depart', "saint john 's college"]]},
 'span_info': [['Taxi-Inform', 'Dest', 'pizza hut fen ditton', 11, 14],
  ['Taxi-Inform', 'Depart', "saint john 's college", 6, 9]]}
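
Although this article only uses the text field, the dialog_act annotation is easy to read once flattened. The following sketch lists it as (act, slot, value) triples; the helper name `flatten_dialog_act` is hypothetical, and the turn is copied from the output above without its span_info field.

```python
turn = {
    'text': "I would like a taxi from Saint John's college to Pizza Hut Fen Ditton.",
    'metadata': {},
    'dialog_act': {'Taxi-Inform': [['Dest', 'pizza hut fen ditton'],
                                   ['Depart', "saint john 's college"]]},
}

def flatten_dialog_act(turn):
    """List (act, slot, value) triples from a turn's 'dialog_act' field."""
    return [(act, slot, value)
            for act, pairs in turn['dialog_act'].items()
            for slot, value in pairs]

triples = flatten_dialog_act(turn)
for triple in triples:
    print(triple)
```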

Currently, we are only interested in the conversation, which is in the text field. The conversation goes back and forth between two people; let's call them 'Person 1' and 'Person 2'. This means that DIALOGUE_DB['SNG0073.json']['log'][0]['text'] is spoken by 'Person 1', DIALOGUE_DB['SNG0073.json']['log'][1]['text'] by 'Person 2', and so on: even offsets belong to 'Person 1' and odd offsets to 'Person 2'.

print(' Person 1: ', DIALOGUE_DB['SNG0073.json']['log'][0]['text'])
print(' Person 2: ',DIALOGUE_DB['SNG0073.json']['log'][1]['text'])
#Output-----------------
#Person 1:  I would like a taxi from Saint John's college to Pizza Hut Fen Ditton.
#Person 2:  What time do you want to leave and what time do you want to arrive by?
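
Because the speakers alternate, slicing with a step of two is enough to separate them. This is a minimal sketch on a hand-made two-turn log with the same shape as a log list in the dataset; the variable names are illustrative only.

```python
# a two-turn log with the same shape as DIALOGUE_DB[file]['log']
log = [
    {'text': "I would like a taxi from Saint John's college to Pizza Hut Fen Ditton."},
    {'text': 'What time do you want to leave and what time do you want to arrive by?'},
]

person_1_turns = [turn['text'] for turn in log[0::2]]  # even offsets
person_2_turns = [turn['text'] for turn in log[1::2]]  # odd offsets

# zip pairs each Person 1 utterance with the reply that follows it;
# a trailing Person 1 turn without a reply would be dropped by zip
exchanges = list(zip(person_1_turns, person_2_turns))
print(exchanges)
```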

We will now implement the get_conversation() function, which extracts a conversation from one of the dataset's files. The conversation is in the text field of each element in the file's log list. If the log list has x elements, the function collects the text entry of each of them. It returns the conversation as a single string, prepending each text with ' Person 1: ' if the element's index is even or ' Person 2: ' if it is odd. The Python modulus operator '%' helps select the even and odd entries.

def get_conversation(file, data_db):
    '''
    Args:
        file (string): filename of the dialogue file saved as json
        data_db (dict): dialogue database
    
    Returns:
        string: the 'text' fields of data_db[file]['log'][x], each prefixed with a speaker label
    '''
    
    # initialize empty string
    result = ''
    
    # get length of file's log list
    len_msg_log = len(data_db[file]['log'])
    
    # set the delimiter strings
    delimiter_1 = ' Person 1: '
    delimiter_2 = ' Person 2: '
    
    # loop over the file's log list
    for i in range(len_msg_log):
        
    
        # get i'th element of file log list
        cur_log = data_db[file]['log'][i]
        
        # check if i is even
        if i%2 == 0:                   
            # append the 1st delimiter string
            result += delimiter_1
        else: 
            # append the 2nd delimiter string
            result += delimiter_2
        
        # append the message text from the log
        result += cur_log['text']
    

    return result

Testing our function:-

# BEGIN UNIT TEST
import w4_unittest
w4_unittest.test_get_conversation(get_conversation)
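
If the w4_unittest module is not available in your environment, a quick hand-written check can stand in for it. The sketch below is an assumption-laden alternative, not the original test: `get_conversation_compact` is a hypothetical, shorter rewrite of the function above, and `mini_db` is a made-up three-turn database used only for this check.

```python
def get_conversation_compact(file, data_db):
    """Build the same string as get_conversation, using enumerate and str.join."""
    parts = []
    for i, entry in enumerate(data_db[file]['log']):
        # even indices are Person 1, odd indices are Person 2
        speaker = ' Person 1: ' if i % 2 == 0 else ' Person 2: '
        parts.append(speaker + entry['text'])
    return ''.join(parts)

# hypothetical mini database, used only for this check
mini_db = {'demo.json': {'goal': {},
                         'log': [{'text': 'Hi'}, {'text': 'Hello'}, {'text': 'Bye'}]}}

result = get_conversation_compact('demo.json', mini_db)
assert result == ' Person 1: Hi Person 2: Hello Person 1: Bye'
print(result)
```

Collecting the pieces in a list and joining once avoids repeated string concatenation, which matters little here but is the more common Python idiom.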

File Testing:-

file = 'SNG01856.json'
conversation = get_conversation(file, DIALOGUE_DB)

# print raw output
print(conversation)

A small pretty-print utility makes the conversation easier to follow visually.

def print_conversation(conversation):
    
    delimiter_1 = 'Person 1: '
    delimiter_2 = 'Person 2: '
    
    split_list_d1 = conversation.split(delimiter_1)
    
    for sublist in split_list_d1[1:]:
        split_list_d2 = sublist.split(delimiter_2)
        print(colored(f'Person 1: {split_list_d2[0]}', 'red'))
        
        if len(split_list_d2) > 1:
            print(colored(f'Person 2: {split_list_d2[1]}', 'green'))

            
print_conversation(conversation)

Output:-

Person 1: am looking for a place to to stay that has cheap price range it should be in a type of hotel 
Person 2: Okay, do you have a specific area you want to stay in? 
Person 1: no, i just need to make sure it's cheap. oh, and i need parking 
Person 2: I found 1 cheap hotel for you that includes parking. Do you like me to book it? 
Person 1: Yes, please. 6 people 3 nights starting on tuesday. 
Person 2: I am sorry but I wasn't able to book that for you for Tuesday. Is there another day you would like to stay or perhaps a shorter stay? 
Person 1: how about only 2 nights. 
Person 2: Booking was successful.
Reference number is : 7GAWK763. Anything else I can do for you? 
Person 1: No, that will be all. Good bye. 
Person 2: Thank you for using our services.

DIALOGUE_DB['SNG01856.json']['log'][0]

Output:-

{'text': 'am looking for a place to to stay that has cheap price range it should be in a type of hotel',
 'metadata': {},
 'dialog_act': {'Hotel-Inform': [['Type', 'hotel'], ['Price', 'cheap']]},
 'span_info': [['Hotel-Inform', 'Type', 'hotel', 20, 20],
  ['Hotel-Inform', 'Price', 'cheap', 10, 10]]}
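
The two numbers at the end of each span_info entry appear to be inclusive token positions in the text. For this sentence, plain whitespace splitting reproduces them, as the sketch below checks. Treat this as an observation from the examples in this article rather than documented behaviour: in the earlier taxi example the positions only line up if "John's" is split into "John" and "'s", so the dataset's tokenizer is evidently not a simple whitespace split. The helper name `span_text` is hypothetical.

```python
turn = {
    'text': 'am looking for a place to to stay that has cheap price range '
            'it should be in a type of hotel',
    'span_info': [['Hotel-Inform', 'Type', 'hotel', 20, 20],
                  ['Hotel-Inform', 'Price', 'cheap', 10, 10]],
}

# assumption: whitespace tokens are good enough for this particular sentence
tokens = turn['text'].split()

def span_text(tokens, start, end):
    """Join tokens[start..end], treating both indices as inclusive."""
    return ' '.join(tokens[start:end + 1])

recovered = {slot: span_text(tokens, start, end)
             for _act, slot, _value, start, end in turn['span_info']}
print(recovered)
```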

The dataset also comes with hotel, hospital, taxi, train, police, and restaurant databases. For example, in case you need to call a doctor, or a hotel, or a taxi, this will allow you to automate the entire conversation. Take a look at the files accompanying the data set.

# this is an example of the attractions file
attraction_file = open('data/attraction_db.json')
attractions = json.load(attraction_file)
print(attractions[0])

#Output----------------
#{'address': 'pool way, whitehill road, off newmarket road', 'area': 'east', 'entrance fee': '?', 'id': '1', 'location': [52.208789, 0.154883], 'name': 'abbey pool and astroturf pitch', 'openhours': '?', 'phone': '01223902088', 'postcode': 'cb58nt', 'pricerange': '?', 'type': 'swimmingpool'}

This is an example of the hospital file:-

hospital_file = open('data/hospital_db.json')
hospitals = json.load(hospital_file)
print(hospitals[0]) # feel free to index into other indices
#Output-------------
#{'department': 'neurosciences critical care unit', 'id': 0, 'phone': '01223216297'}

This is an example of the hotel file:-

hotel_file = open('data/hotel_db.json')
hotels = json.load(hotel_file)
print(hotels[0]) # feel free to index into other indices
#Output----------------
#{'address': '124 tenison road', 'area': 'east', 'internet': 'yes', 'parking': 'no', 'id': '0', 'location': [52.1963733, 0.1987426], 'name': 'a and b guest house', 'phone': '01223315702', 'postcode': 'cb12dp', 'price': {'double': '70', 'family': '90', 'single': '50'}, 'pricerange': 'moderate', 'stars': '4', 'takesbookings': 'yes', 'type': 'guesthouse'}

This is an example of the police file:-

police_file = open('data/police_db.json')
police = json.load(police_file)
print(police[0]) # feel free to index into other indices
#Output----------------
#{'name': 'Parkside Police Station', 'address': 'Parkside, Cambridge', 'id': 0, 'phone': '01223358966'}

This is an example of the restaurant file:-

restaurant_file = open('data/restaurant_db.json')
restaurants = json.load(restaurant_file)
print(restaurants[0]) # feel free to index into other indices
#Output----------------
#{'address': 'Regent Street City Centre', 'area': 'centre', 'food': 'italian', 'id': '19210', 'introduction': 'Pizza hut is a large chain with restaurants nationwide offering convenience pizzas pasta and salads to eat in or take away', 'location': [52.20103, 0.126023], 'name': 'pizza hut city centre', 'phone': '01223323737', 'postcode': 'cb21ab', 'pricerange': 'cheap', 'type': 'restaurant'}
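
Since each database is a list of flat dictionaries, looking up entries that match a user's constraints is a simple filter. The sketch below is one possible way to do it, not part of the original code: `find_records` is a hypothetical helper, and the two records are trimmed copies of the hotel and restaurant outputs above.

```python
# two records copied (and trimmed) from the outputs above
hotels = [{'name': 'a and b guest house', 'area': 'east', 'pricerange': 'moderate',
           'parking': 'no', 'type': 'guesthouse'}]
restaurants = [{'name': 'pizza hut city centre', 'area': 'centre', 'food': 'italian',
                'pricerange': 'cheap', 'type': 'restaurant'}]

def find_records(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [rec for rec in records
            if all(rec.get(field) == wanted for field, wanted in criteria.items())]

cheap_italian = find_records(restaurants, pricerange='cheap', food='italian')
east_with_parking = find_records(hotels, area='east', parking='yes')

print([rec['name'] for rec in cheap_italian])
print(east_with_parking)
```

Note that values such as 'yes', 'no', and '4' are stored as strings in these files, so criteria need to be passed as strings too.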

For more information about the MultiWOZ 2.1 dataset, let’s print the README file.

with open('data/README') as file:
    print(file.read())

As we can see, there are many other aspects of the MultiWoz dataset. Nonetheless, we'll see that even with just the conversations, our model will still be able to generate useful responses.

This concludes our exploration of the dataset. In the next part, we will do some preprocessing before we feed the data into our model for training.

Find the dataset and source code at https://github.com/shsarv/ChatBot.