Building Your Own Natural Language FAQs

Bernardt Duvenhage
Feersum Engine
9 min read · Oct 24, 2017


Introduction

Imagine that you and the kids are on your way to visit your mother when your car breaks down and you are stranded. A virtual assistant on your phone or in your car capable of guiding you step-by-step through changing a tire, checking electrical fuses or topping up your car’s coolant would be a real life saver, wouldn’t it?

Alternatively, imagine how much easier tasks like filling in your tax return or reading through a contract could be if you had an expert in your pocket. Such virtual assistants, also called chatbots, already exist. These are software programs that can understand what you type, quickly reference various knowledge bases and formulate responses in easy-to-understand natural language.

Feersum Engine is a conversation engine and a collection of Natural Language Understanding (NLU) APIs optimised for Africa. We reuse open source NLU building blocks when possible and develop our own algorithms when required. Some of the chatbots that we’ve worked on are helping young mothers with health- and clinic-related questions, assisting people to take out insurance and submit claims, or aiding attendees at conferences to navigate both the venue and the schedule.

Although chatbot technology is unfortunately not yet at a level where one can create general virtual assistants like Iron Man’s J.A.R.V.I.S., we are working hard on making such systems a reality. In this post I will explain how you can build a natural language FAQ system which is an important part of many chatbots.

Following on From PyConZA

I was at PyConZA recently to talk about building text based natural language FAQs. PyConZA is the annual gathering of the South African community using and developing the open-source Python programming language. This year the conference was held at The River Club in Observatory, Cape Town.

PyConZA Talk - Video and slides available online.

This post follows on from the talk at PyConZA and I’ll go into the topic of building natural language FAQs in more detail than is possible during a conference talk. I’ll also share our FeersumNLU playground where readers can go and build their own FAQs, as well as experiment with other NLU functions such as intent detection and information extraction.

Before we have some fun building our own natural language FAQ, please take a few minutes to read the overview below of the technology behind it all.

The Technology behind Natural Language FAQs

The goal of a natural language FAQ is to interpret a user’s question and then find the best matching questions from your list of frequent questions. The system may additionally be able to report when a user’s question is not adequately covered by your list of questions.

FAQs may therefore be used as a simple form of question answering in cases where all the possible questions (and answers) are known. In cases where the questions and answers cannot be known beforehand, one must implement more advanced question answering. Interested readers can have a look at the Stanford Question Answering Dataset (SQuAD) and shared task.

To find the best matching questions from a list of frequent questions, one needs a way to measure the semantic similarity of sentences. To do this, we generate a sentence vector for each question and then use a metric, such as the inner product of two sentence vectors, to measure their semantic similarity. A good overview of sentence vectors and semantic similarity may be found in A Simple but Tough-to-Beat Baseline for Sentence Embeddings, published by Arora et al. in 2017. We’re in the process of publishing our FeersumNLU approach to semantic similarity, so I’ll present Arora et al.’s method in this post.
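To make the inner-product metric concrete, here is a self-contained sketch with made-up toy word vectors (real systems use pre-trained 200- to 300-dimensional embeddings; the vectors and sentences below are purely illustrative):

```python
import numpy as np

# Toy 4-dimensional word vectors, made up for illustration; real systems
# use 200- to 300-dimensional GloVe or Word2Vec embeddings.
word_vectors = {
    "the":    np.array([0.1, 0.1, 0.1, 0.1]),
    "cat":    np.array([0.9, 0.2, 0.1, 0.0]),
    "dog":    np.array([0.8, 0.3, 0.1, 0.1]),
    "is":     np.array([0.1, 0.1, 0.2, 0.1]),
    "in":     np.array([0.1, 0.2, 0.1, 0.1]),
    "house":  np.array([0.0, 0.1, 0.9, 0.2]),
    "garden": np.array([0.1, 0.1, 0.8, 0.4]),
}

def sentence_vector(sentence):
    """Unweighted average of the word vectors, normalised to unit length."""
    v = np.mean([word_vectors[w] for w in sentence.lower().split()], axis=0)
    return v / np.linalg.norm(v)

def similarity(s1, s2):
    """Inner product of two unit-length sentence vectors."""
    return float(np.dot(sentence_vector(s1), sentence_vector(s2)))

print(similarity("The cat is in the house", "The cat is in the house"))   # ~1.0
print(similarity("The cat is in the house", "The cat is in the garden"))
print(similarity("The cat is in the house", "The dog is in the garden"))
```

Even with these toy vectors, swapping a single content word lowers the inner product, and swapping two lowers it further, mirroring the pattern in the plots discussed below.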

The sentence vector of a question is calculated by taking the weighted average over the word vectors of a sentence. Word vectors are learnt unsupervised from a large corpus of unlabelled text. These vectors are context based and low dimensional compared to the size of the vocabulary. Readers interested in knowing more can refer to Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors, published by Baroni et al. in 2014. GloVe, Word2Vec or similar 200- to 300-dimensional vectors with embedding vocabularies of 400k to 1M words are typically used in practice. In my PyConZA presentation I also mentioned the manifold hypothesis as an explanation of why machine learning in general and word vectors in particular work. It is well worth reading up on.

When calculating the sentence vector, each word vector is weighted by the importance of its word. The importance is estimated to be inversely proportional to the word’s frequency in the corpus. Put differently, words that are used often are considered less important and words used less often are assumed to be semantically more important.

It was also found that one may improve the accuracy of the similarity measure by first removing the principal component of the sentence vectors of the list of frequent questions. It is interesting to note that the principal component turns out to be similar to the vectors of words like "is" and "just" that represent structural rather than semantic aspects of language.
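The two ideas above, frequency-based weighting and principal component removal, are the core of Arora et al.'s method. A self-contained sketch, again with made-up toy word vectors and corpus frequencies (the numbers are illustrative only):

```python
import numpy as np

# Toy word vectors and corpus frequencies p(w), made up for illustration.
word_vectors = {
    "the":    np.array([0.1, 0.1, 0.1, 0.1]),
    "cat":    np.array([0.9, 0.2, 0.1, 0.0]),
    "dog":    np.array([0.8, 0.3, 0.1, 0.1]),
    "is":     np.array([0.1, 0.1, 0.2, 0.1]),
    "in":     np.array([0.1, 0.2, 0.1, 0.1]),
    "house":  np.array([0.0, 0.1, 0.9, 0.2]),
    "garden": np.array([0.1, 0.1, 0.8, 0.4]),
}
word_freq = {"the": 0.07, "is": 0.03, "in": 0.02,
             "cat": 0.001, "dog": 0.001, "house": 0.002, "garden": 0.001}

def sif_sentence_vectors(sentences, a=1e-3):
    """Smooth-inverse-frequency sentence vectors (Arora et al., 2017):
    average word vectors weighted by a / (a + p(w)), then remove each
    sentence's projection onto the first principal component."""
    rows = []
    for s in sentences:
        words = s.lower().split()
        rows.append(np.mean([a / (a + word_freq[w]) * word_vectors[w]
                             for w in words], axis=0))
    X = np.vstack(rows)
    u = np.linalg.svd(X, full_matrices=False)[2][0]  # first singular vector
    X = X - np.outer(X @ u, u)                       # remove the common component
    return X / np.linalg.norm(X, axis=1, keepdims=True)

questions = ["The cat is in the house",
             "The cat is in the garden",
             "The dog is in the garden"]
V = sif_sentence_vectors(questions)
print(V @ V.T)  # pairwise inner products; the diagonal is ~1.0
```

Note how frequent words like "the" and "is" receive tiny weights, so rare content words such as "cat" and "house" dominate each sentence vector before the common component is removed.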

To gain some more insight into how semantic similarity works, the image below shows a 50-dimensional sentence vector for The cat is in the house. The different dimensions are shown on the x-axis and the projection of the sentence vector along each dimension is shown on the y-axis. Since the sentence vectors are normalised, the inner product of a sentence with itself is 1.0.

The next image shows the sentence vector for The cat is in the house in blue vs. the sentence vector for The cat is in the garden in red. Note the differences in the two vectors due to the cat being in the garden as opposed to in the house. The inner product of these two sentences is 0.886.

The image below shows the sentence vector for The cat is in the house in blue vs. The dog is in the garden in red. Since the cat in the second sentence has been replaced by a dog, the difference between the two vectors is greater than before and the inner product is 0.841.

The image below shows the difference between two quite dissimilar sentences: The cat is in the house vs. I need help with my car. The inner product is now 0.498.

Looking at the three example sentence pairs shown above, the similarity of the sentence vectors does seem to be a good indication of semantic similarity. The FeersumNLU sentence similarity algorithm uses a different sentence vector construction and an L1 similarity metric, but is conceptually similar to Arora et al.’s algorithm. Also, multi-lingual FAQs are built by automatically creating an FAQ model for each language found in the frequent questions. During prediction the algorithm first detects the user’s language and then runs the appropriate FAQ model. We’re busy researching how one could instead use one universal word embedding and a single FAQ model across all languages.

Now that you know what a sentence vector is, how to measure the semantic similarity between two sentences and how a natural language FAQ operates, we can go through an example of using an NLU API to build a natural language FAQ.

Building your own Natural Language FAQ

For this example I’ll be using the FeersumNLU Playground, which is a publicly available instance of the natural language understanding service of Feersum Engine. A similar FAQ, at least in English, could alternatively be built with other NLU services like wit.ai (using intents) and Microsoft’s QnA Maker. The benefits of FeersumNLU over alternative NLU services include that it is developed in South Africa with locally collected data, hosted locally and designed with resource-limited languages in mind.

Readers who are inclined to get their hands dirty can play with the example code themselves. The link to the FeersumNLU playground service and the Medium_FAQ demo Python notebook may be found in the GitHub repo at https://github.com/praekelt/feersum-nlu-api-wrappers. I’ll be using the Python language API wrapper, which one needs to install first. However, if you wish to get just one file from the repo, or if you are unfamiliar with Python, you may download the curl version of the Medium_FAQ demo notebook. The curl version is an editable Linux bash script that directly accesses the HTTP API.

The following snippets show first how to set up the Python wrapper for the FeersumNLU Playground, then how to train a dual-language FeersumNLU FAQ model and finally how to match a user’s question to an FAQ using the trained model.

The snippet below imports the feersum_nlu API wrapper module and configures the key and service instance to use. The API requires a key (authentication token) to allow access to the service. The GitHub page has the details on how to get your own key for the FeersumNLU Playground instance.

Snippet to initialise the Python language wrapper for the FeersumNLU playground.
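The gist embedded in the original post isn’t reproduced here, so below is a minimal sketch of the setup. The class names (Configuration, ApiClient, FaqMatchersApi) follow the swagger-generated wrapper’s conventions and the host URL is an assumption; check the GitHub repo for the current API:

```python
import feersum_nlu
from feersum_nlu.rest import ApiException

# Configure the playground service instance and your authentication token.
configuration = feersum_nlu.Configuration()
configuration.api_key['AUTH_TOKEN'] = 'YOUR_AUTH_TOKEN'  # get a key as described on the GitHub page
configuration.host = 'https://nlu.playground.feersum.io/nlu/v2'  # assumed playground URL

# Client object for the FAQ matcher endpoints.
api_instance = feersum_nlu.FaqMatchersApi(feersum_nlu.ApiClient(configuration))
```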

The FAQ training data is a list of labelled questions. The questions are labelled with either the matching answer’s unique ID or text string. FeersumNLU will automatically bin the questions by language and effectively construct an FAQ for each language found in the training data. It supports the 11 official South African languages by default, but it is possible to configure the language identification model and to provide language hints as part of the training data.

The snippet below shows how the training samples are defined. For this example there are three answer labels: faq_tire, faq_engine and faq_accident. In other words, this model will cover three frequently asked questions on tires, the engine and having had an accident. To increase the accuracy of the model, four examples of how each question could be asked are provided. In practice an FAQ model will have more than three answer labels and could easily have more than four example questions per answer.
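A hypothetical version of such training data, with four example phrasings per label in both English and Afrikaans (the question texts are made up for this sketch), might look like this:

```python
# Hypothetical training samples: (question text, answer label) pairs.
# The service bins these by language automatically.
faq_training_samples = [
    ("How do I change a flat tire?", "faq_tire"),
    ("My tire is punctured.", "faq_tire"),
    ("Hoe ruil ek my kar se band?", "faq_tire"),
    ("Ek het 'n pap band.", "faq_tire"),
    ("My engine won't start.", "faq_engine"),
    ("There is smoke coming from the engine.", "faq_engine"),
    ("My kar se enjin wil nie vat nie.", "faq_engine"),
    ("Die enjin rook.", "faq_engine"),
    ("I was in a car accident.", "faq_accident"),
    ("Someone crashed into my car.", "faq_accident"),
    ("Ek was in 'n ongeluk.", "faq_accident"),
    ("Iemand het in my kar vasgery.", "faq_accident"),
]
```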

To make an FAQ model one needs to create the model, add the training samples to it and then train it as shown below.
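A sketch of those three steps, assuming the `api_instance` client from the setup snippet and a `(text, label)` training list as described above; the class and method names (FaqMatcherCreateDetails, LabelledTextSample, TrainDetails and the faq_matcher_* calls) are assumptions based on the swagger-generated wrapper:

```python
instance_name = 'medium_faq_demo'  # hypothetical model name

# 1. Create the FAQ matcher instance on the service.
create_details = feersum_nlu.FaqMatcherCreateDetails(
    name=instance_name, desc='Roadside assistance FAQ.', load_from_store=False)
api_instance.faq_matcher_create(create_details)

# 2. Wrap the (text, label) training pairs and add them to the model.
labelled_samples = [feersum_nlu.LabelledTextSample(text=text, label=label)
                    for text, label in faq_training_samples]
api_instance.faq_matcher_add_training_samples(instance_name, labelled_samples)

# 3. Train; the service builds a per-language FAQ model behind the scenes.
api_instance.faq_matcher_train(instance_name, feersum_nlu.TrainDetails(immediate_mode=True))
```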

Once trained, the FAQ model may be used to match new user questions to one of the question and answer pairs in the training set. This is done using the retrieve operation on the trained model with the user’s question. For example, the model matches the user text I think I need petrol to the faq_engine label, in which case the bot could respond with a message such as If you're having engine trouble or need fuel, ... .

Snippet to make a prediction using the trained FAQ model.
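The retrieve call itself might look like the following sketch (method and class names are assumptions based on the swagger-generated wrapper; `api_instance` and `instance_name` come from the earlier setup and training steps):

```python
# Match a new user question against the trained FAQ model.
text_input = feersum_nlu.TextInput("I think I need petrol.")
api_response = api_instance.faq_matcher_retrieve(instance_name, text_input)
print(api_response)  # a scored list of label matches, best first
```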

Although not semantically very similar to the training data, the above example was chosen to show the power of context-based word vectors, which associated a sentence about petrol with the training samples about the engine. The next snippet shows another example.

Snippet to make a prediction using the trained FAQ model.
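The second prediction is the same retrieve operation with Afrikaans input (again a sketch under the same naming assumptions as above):

```python
# The service first detects the language of the input and then runs the
# appropriate per-language FAQ model.
text_input = feersum_nlu.TextInput("Ek het my kar verongeluk.")
api_response = api_instance.faq_matcher_retrieve(instance_name, text_input)
print(api_response)  # expected best match: faq_accident
```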

The Afrikaans sentence Ek het my kar verongeluk (which means I crashed my car) correctly matches the label faq_accident. Once again the context-based word vectors help to match the verb form verongeluk to the noun ongeluk in the training samples. The chatbot would therefore be able to respond appropriately.

The retrieve operation could return a sorted list of matches if the user’s text is similar to frequent questions from more than one label. A chatbot could then display just the top match or perhaps the top two or three to help users formulate their questions.
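Once the scored matches are available as (label, score) pairs, a bot could apply a simple confidence threshold to decide between answering directly and offering the top few options. A self-contained sketch (the threshold value, answer texts and helper name are all illustrative):

```python
def format_reply(scored_matches, answers, high_conf=0.8):
    """scored_matches: list of (label, score) pairs sorted best-first.
    Reply with the top answer when confident, otherwise offer up to the
    top three answers as options to help the user refine the question."""
    if not scored_matches:
        return "Sorry, I couldn't find a matching question."
    top_label, top_score = scored_matches[0]
    if top_score >= high_conf:
        return answers[top_label]
    options = [answers[label] for label, _ in scored_matches[:3]]
    return "Did you mean one of these?\n- " + "\n- ".join(options)

# Hypothetical answer texts for the three labels used in this post.
answers = {
    "faq_tire": "To change a tire, first pull over somewhere safe ...",
    "faq_engine": "If you're having engine trouble or need fuel, ...",
    "faq_accident": "If you've been in an accident, ...",
}

print(format_reply([("faq_engine", 0.91)], answers))
print(format_reply([("faq_tire", 0.55), ("faq_engine", 0.41)], answers))
```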

Conclusion

I hope that this post was fun and informative. If anyone is interested in using Feersum Engine or FeersumNLU in their own adventures please let us know. We can help with building chatbots, creating NLU models specialised for your domain as well as with hosting.



Feersum Engine NLP & Machine Learning Lead at Praekelt Consulting, Toronto Area, Canada.