Writing Conversational Chatbots — The Basics

Shahnewaz Leon
Monstar Lab Bangladesh Engineering
5 min read · Jan 30, 2020
Photo by Volodymyr Hryshchenko on Unsplash

“By 2022, 70% of white-collar workers will interact with conversational platforms on a daily basis” (Gartner)

What exactly is a conversational chatbot?

A conversational chatbot is AI-powered software with a text or voice interface that tries to mimic an intelligent being while interacting with users. Conversation in everyday language is a more natural way of interacting than scrolling, clicking, and filling out forms, and AI-powered conversational chatbots are a natural step towards fulfilling that potential.

What are the value additions?

The two most important value additions of AI chatbots are: 1. offering your customers the most natural way of interacting, and 2. saving time and effort for both the business and the customer. In its simplest use case, a conversational chatbot answers the questions users ask most frequently, saving them the hassle of foraging through your company website for the relevant information. In its most sophisticated form, the chatbot combines diverse knowledge bases and tailors its answers to the particular user. The following image from Gartner sums up beautifully how a chatbot can save you time and effort.

Chatbots can make users' life easy! — Gartner

Writing your own chatbot — the prerequisites

A chatbot can tremendously help your business and your users. However, the promise of a chatbot is also its Achilles' heel: to be useful, it has to understand what a user is saying and carry on the conversation intelligently. This may not sound like a daunting task to us; after all, we have been doing it since we learned to talk. But don’t forget to appreciate what a challenging feat it is. In the next sections, I hope to give you an idea of just what is involved in writing an intelligent chatbot. Fortunately for us, recent years have seen tremendous advances in Machine Learning and Artificial Intelligence, which make writing your own chatbot comparatively easier.

NLP concepts you need to know

At the heart of a chatbot is the ability to understand and respond in natural language, and the field that deals with this is Natural Language Processing (NLP). It sounds like a mouthful, but we are going to break it down into the parts we need in order to build a simple chatbot. NLP sits at the intersection of computer science, AI, ML, and computational linguistics. It provides concepts, tools, and algorithms that let computers analyze, understand, and form responses in a smart and useful way. We can use NLP to perform tasks involving natural language such as text summarization, translation, sentiment analysis, topic modeling, and named entity extraction.

Getting your text ready for NLP — text pre-processing

Photo by Todd Quackenbush on Unsplash

The first step in any type of NLP task is pre-processing the text data that will be fed into the NLP algorithms. The most commonly used pre-processing steps are tokenization, removing noise and stop words, stemming, and lemmatization.

Tokenization is the process of converting the input text into tokens, i.e. words or sentences, for further processing. How tokenization is done depends on the language of the text. It can be as easy as splitting on whitespace characters for languages that separate words with whitespace, or as hard as writing a special algorithm that follows the language's own rules of word and sentence formation, as is needed for languages such as Japanese or Burmese, where words are not consistently separated by spaces.
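
For a whitespace-separated language such as English, the easy case can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the function names and the sample message are made up for this example, and a real chatbot would normally lean on an NLP library instead.

```python
import re

def tokenize_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

def tokenize_words(text):
    """Lowercase the text and split it on runs of whitespace."""
    return text.lower().split()

# Hypothetical user message, used only for illustration
message = "Hi there! Can I change my delivery address?"
sentences = tokenize_sentences(message)
words = tokenize_words(message)

print(sentences)
print(words)
```

Note that punctuation is still attached to some of the word tokens ("there!", "address?"); cleaning that up is the job of the next step.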

The next step is removing noise and stop words. Noise is anything that doesn’t help us understand the text, such as punctuation marks, emoticons, etc. Stop words are extremely common words or filler words (for example ‘the’, ‘is’, ‘a’) that carry no special meaning and do not help us understand the text any better, so they are removed as well.
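
Here is a rough sketch of this step in Python. The stop-word list below is a tiny hand-picked set assumed for illustration (real projects usually take a much longer list from an NLP library), and the helper names are hypothetical.

```python
import string

# A tiny, hand-picked stop-word list, for illustration only
STOP_WORDS = {"a", "an", "the", "is", "are", "i", "my", "to", "can", "of", "and"}

def remove_noise(tokens):
    """Strip ASCII punctuation from each token and drop tokens left empty."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = (token.translate(table) for token in tokens)
    return [token for token in cleaned if token]

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]

# Lowercased word tokens, including an emoticon made purely of punctuation
tokens = ["hi", "there!", "can", "i", "change", "my", "delivery", "address?", ":)"]
clean_tokens = remove_stop_words(remove_noise(tokens))

print(clean_tokens)
```

The emoticon disappears because nothing is left of it once punctuation is stripped, while "can", "i" and "my" are dropped as stop words.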

Stemming is the process of reducing inflected words to a base form so that different variations of the same word are not treated as different words. For example, we understand that the words ‘walking’, ‘walks’, and ‘walked’ derive from the same root word ‘walk’. In this step, we reduce all these words to the root word ‘walk’.
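
To make the idea concrete, here is a deliberately crude suffix-stripping stemmer in Python. It is a toy written for this example, not the Porter algorithm or any library's implementation; the suffix list and the three-letter minimum are assumptions chosen to keep it short.

```python
# Suffixes to strip, checked in this order (an assumption for this toy example)
SUFFIXES = ("ing", "ed", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [crude_stem(word) for word in ["walking", "walks", "walked", "walk"]]

print(stems)
```

All four inputs collapse to "walk". A toy like this breaks down quickly on other words (it would turn "running" into "runn", for instance), which is why production stemmers use larger, carefully ordered rule sets.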

Lemmatization is an improved normalization technique that is closely related to stemming. The basic difference is that stemming uses a crude heuristic, simply chopping off the tails of words in the hope of reaching the base form in the majority of cases, whereas lemmatization uses a vocabulary and morphological analysis of words, keeping the context in mind. For example, ‘better’ is a derived form of the word ‘good’. Without knowledge of the vocabulary and morphological analysis of the word, it is not possible to get to the base word ‘good’, so in this particular case stemming will not succeed in reducing the word. However, such cases are rare, and in most cases stemming might be the more favorable technique given the tradeoff between accuracy, complexity, and execution time.
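
The ‘better’ versus ‘good’ case can be illustrated with a toy comparison in Python. The lookup table below is a hypothetical, hand-written stand-in for the vocabulary a real lemmatizer would consult, and this sketch ignores context and part of speech entirely, so treat it as an illustration of the idea rather than a working lemmatizer.

```python
# Hypothetical hand-written vocabulary of irregular forms; a real lemmatizer
# relies on a full dictionary plus morphological and part-of-speech analysis.
IRREGULAR_LEMMAS = {"better": "good", "best": "good", "went": "go", "mice": "mouse"}

def crude_stem(word):
    """Toy stemmer: strip the first matching suffix, keeping three letters."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def toy_lemmatize(word):
    """Look the word up in the vocabulary first, then fall back to stemming."""
    return IRREGULAR_LEMMAS.get(word, crude_stem(word))

stemmed = crude_stem("better")    # no suffix rule applies, so it is unchanged
lemma = toy_lemmatize("better")   # found in the vocabulary

print(stemmed, lemma)
```

The stemmer leaves "better" untouched, while the vocabulary lookup maps it to "good". Regular words such as "walked" still fall through to the stemming rules.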

Now that our text is conveniently broken down into tokens, we can focus on converting it into a form that the machine can understand and that can be fed into NLP algorithms. Bag of Words (BoW) is one such very simple technique: it represents a text by which words occur in a document and how often. BoW doesn’t preserve the ordering of the words, and it also ignores their semantics. There are better alternatives, such as word embeddings (for example word2vec), which represent words as vectors and capture some of their semantics as well.
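
A Bag of Words representation is simple enough to build by hand. The sketch below uses only the Python standard library; the two tiny "documents" and the function names are invented for this example.

```python
from collections import Counter

def build_vocabulary(documents):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in documents for token in doc})

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Two hypothetical, already pre-processed user messages
docs = [
    ["change", "delivery", "address"],
    ["track", "delivery", "delivery", "status"],
]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)
print(vectors)
```

Each document becomes a vector of counts with one position per vocabulary word. Shuffling the tokens of a document produces exactly the same vector, which is the loss of word order mentioned above.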

Now that we understand the basics of what goes into building a chatbot, let’s delve into actually writing a chatbot of our own. In the next article, we will build an AI-powered chatbot using the open-source conversational AI framework Rasa.

Feel free to check out our company blog on Medium to read more tech articles. You can visit our website to learn more about us. Till then, cheerio!
