Question Answering for Conversational Interfaces

Arushi Raghuvanshi
Published in MindMeld Blog · Sep 10, 2020
Image Credit: Automatic Question Answering

Question answering is one of the most universal and challenging problems in Natural Language Processing. Many tasks in NLP can be modeled as a question answering problem over language input, and it is a key component in areas such as reading comprehension, web search, and dialog systems.

Recent academic advancements in question answering have focused on the task of answering a question relating to a specified document. However, for many practical applications, including conversational interfaces, the crucial step is retrieving the right document or data point from a large set of documents. In this post, we will introduce question answering in the context of conversational interfaces, demonstrate the supported QA functionality in MindMeld, and compare it to other frameworks.

Reading comprehension is one area of question answering. In this task, the question answerer is fed both a question and a document or passage. The model is responsible for understanding the question and extracting the answer, usually a word or phrase from the document. All real-world information needed to answer the question is provided to the model.

Information retrieval is another area of question answering, in which the goal is to retrieve the most relevant document from a database given some text or contextual information. The former task has seen the most attention from academia in the last few years, while the latter tends to be the focus for dialog systems. In addition to these tasks, dialog systems often need components to understand the user’s question, extract key search terms from the query, and generate a natural language response.

An example passage, question, answer set from SQuAD. The example model uses features, like the highlighted information from the dependency tree, to extract the answer from the provided passage. Image Credit: The Stanford Question Answering Dataset

For dialog systems, the NLP pipeline, including intent classifiers and entity recognizers, is designed to extract the relevant information needed to understand the question. This pipeline of domain, intent, and entity classification models has repeatedly been shown to work well for dialog systems and has become an industry standard. The classifications from these models are meant to give the developer enough understanding of the query or question to form a response. Given the context provided by these NLP outputs, it is the developer’s responsibility to answer the question or complete the task.

Pipeline of models to understand the user question or query. More information in the MindMeld docs.
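
To make this concrete, here is a rough sketch of what the pipeline produces for a single query, assuming a food ordering app whose classifiers are already trained (the domain, intent, and entity names here are illustrative):

from mindmeld.components.nlp import NaturalLanguageProcessor

# Build the domain, intent, and entity models from the app's training data
nlp = NaturalLanguageProcessor(app_path='food_ordering')
nlp.build()

# Run the full pipeline on a user query
output = nlp.process('Order a pad thai with tofu from Thai Basil')
# output resembles:
# {'domain': 'ordering', 'intent': 'build_order',
#  'entities': [{'text': 'pad thai', 'type': 'dish', ...},
#               {'text': 'tofu', 'type': 'option', ...},
#               {'text': 'Thai Basil', 'type': 'restaurant', ...}]}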

In some cases, the NLP classifications are enough to completely specify the question and context, so as a developer, you can answer the question simply by providing a templated response. For example, you may define an “identity” intent to answer questions about who the user is talking to. If the user says something like “Am I talking to a bot?”, you can respond to this query, and all others classified as the “identity” intent, with a templated response like “Hi there, yes, I am your assistant bot. How can I help?” All the popular platforms support this format of leveraging the NLP pipeline for developer-defined question answering. They leave it up to the developer to define the domains, intents, and entities; collect enough data to train the models; and create templated responses.

Example of a simple templated response for QA.
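
In MindMeld, a templated response like this takes only a few lines of code in a dialogue state handler. A minimal sketch, assuming your app defines an “identity” intent:

from mindmeld import Application

app = Application(__name__)

@app.handle(intent='identity')
def handle_identity(request, responder):
    # Every query classified as the 'identity' intent gets the same canned reply
    responder.reply('Hi there, yes, I am your assistant bot. How can I help?')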

In many cases, you need additional information or real-world context to answer the user’s question or complete the request. A knowledge base is a database that stores the information needed for your use case, and the question answerer is the component that makes it easy to query.

As an example, consider the query “Which artist plays the song Bohemian Rhapsody?” Your domain and intent classifiers can recognize this as a “music” domain and “browse artist” intent, and your entity recognizer can extract “Bohemian Rhapsody” as a “song title” entity. But to answer the question, you still need to query a knowledge base that contains information about songs, including titles and artist names.

In some generic cases, you may be able to leverage an existing API to query for this type of information, but for many applications, you need a custom, application-specific database to get the best results. In either case, you may need fuzzy matching, semantic matching, phonetic matching, or sorting and filtering based on the context of the user and the conversation to retrieve the most relevant set of results. This problem of retrieving relevant documents and fields from a structured knowledge base is what we call question answering on structured documents.

Example of a QA response which requires a call to a structured knowledge base.
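
A handler for the music example might look like the sketch below, which assumes a hypothetical songs index with title and artist fields:

@app.handle(intent='browse_artist')
def browse_artist(request, responder):
    # Use the song_title entity extracted by the NLP pipeline as the search term
    song_titles = [e['text'] for e in request.entities if e['type'] == 'song_title']
    results = []
    if song_titles:
        results = app.question_answerer.get(index='songs', title=song_titles[0])
    if results:
        responder.reply('{title} is performed by {artist}.'.format(**results[0]))
    else:
        responder.reply("Sorry, I couldn't find that song.")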

Finally, there may be cases where you don’t want to define a new domain or intent for every new question you would like to support. Instead of relying on the NLP pipeline to understand the question, you may want to feed the entire user query to a QA system and get the most relevant document. One common example of this scenario is for FAQ-style question and answer pairs. You can store an evolving set of question-answer pairs in your knowledge base, and when an FAQ query comes in, return the document that matches most closely. Another common example is for search on unstructured documents. You can store unstructured blobs of text in your knowledge base, and for a given user query, return the most relevant document. While leveraging the NLP classifiers is generally more accurate and easier to evaluate for small to mid-sized datasets, question answering on unstructured text requires less developer time and resources, and it performs reasonably well for many use cases.

Example of a QA response for unstructured text.

The general approach of using the NLP pipeline is a broad topic that’s described in detail in the MindMeld documentation. Below, we will take a closer look at the two approaches that involve querying a knowledge base (with structured or unstructured text) via the MindMeld question answerer.

MindMeld question answering with a structured knowledge base

Building a custom knowledge base using application data is straightforward in MindMeld. The data can be restaurant menus, retail product catalogs, or any custom data that users would like to interact with through the conversational interface.

The question answerer takes in data files that contain knowledge base objects, the basic units of information in knowledge base indexes. Each data file contains objects of a specified type, and each object has:

  • an ID field as the unique identifier
  • a list of arbitrary data fields of different types that contain information about the object, or about relationships with other object types.
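
For example, a single object in a hypothetical menu_items index might look like the following (the field names are illustrative, chosen to match the queries below):

{
    "id": "B01CUUBRZY",
    "name": "Pad Thai",
    "description": "Rice noodles stir-fried with egg, bean sprouts, and peanuts",
    "price": 10.95,
    "option": ["tofu", "chicken", "shrimp"],
    "restaurant_id": "B01CGKGQ40"
}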

To efficiently and accurately retrieve the most relevant information, the question answerer creates optimized indexes. The question answerer processes all fields in the data, determining the data field types and indexing them accordingly. For example, for text fields, the index is set up to take care of exact matching, matching on normalized text, n-grams for phrase matching, character n-grams for partial word matching and misspellings, deep embedding based semantic matching (more details to come in our next post), and more. For numeric fields, the index is set up to sort and filter by value. And so on.

By loading your knowledge base via the MindMeld question answerer, you get easy-to-use, advanced information retrieval support with minimal effort:

mindmeld load-kb food_ordering restaurants data/restaurants.json

Once your knowledge base is loaded, you can query the index using the question answering search API. This example query searches the menu items index for dishes with a name similar to “pad thai”:

app.question_answerer.get(index='menu_items', name='pad thai')

You can also search against multiple terms:

app.question_answerer.get(index='menu_items', name='pad thai',
                          option='tofu')

You can filter as needed. For example, this is useful if you have context from a previous query in which the user already specified the restaurant they would like to order from:

app.question_answerer.get(index='menu_items', name='pad thai',
                          option='tofu', restaurant_id='B01CGKGQ40')

You can also sort the relevant results by desired factors. For example, if you have context from the user that they prefer cheaper options, you can sort by price:

app.question_answerer.get(index='menu_items', name='pad thai',
                          option='tofu', restaurant_id='B01CGKGQ40',
                          _sort='price', _sort_type='asc')

We’ve found that for task-oriented dialog systems, if the developer has the time to build their question answering system in this more structured way, it is the most efficient and reliable approach.

For more details on QA on a structured knowledge base, refer to the MindMeld documentation.

MindMeld question answering for unstructured text

In some cases, using a structured knowledge base may be impractical. For example, you may have a large set of unstructured documents that can’t reasonably be converted into a structured format. These documents may change frequently, and you might not have the resources to update your NLP pipeline at the same rate. Or you might simply not have the bandwidth or technical expertise to develop the training data for the domain/intent/entity classifiers needed for a more structured approach.

The difference between a structured knowledge base and an unstructured one is in how MindMeld handles the search query. Internally, while the ranking algorithm remains the same for both cases, the features extracted for ranking are different and are optimized to handle long text passages rather than keyword phrases.

To search the knowledge base over unstructured text, we use the get() API with one small modification: we set the query_type parameter to 'text'. Under the hood, this search performs additional optimizations, such as stemming and stop word removal, which give better results on unstructured data:

app.question_answerer.get(index='faq_data', query_type='text',
                          question=query, answer=query)
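
This query assumes each document in the faq_data index stores a question and its answer as text fields, so a hypothetical entry might look like:

{
    "id": "faq_12",
    "question": "How do I reset my password?",
    "answer": "Go to the account settings page and select 'Forgot password'."
}

Passing the full user query against both the question and answer fields lets the ranker match on whichever is more similar.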

For more details on QA for unstructured text, refer to the MindMeld documentation.

Comparison with other platforms

There are many other platforms for building conversational interfaces, including Dialogflow, Wit.ai, Amazon Lex, RASA, and Microsoft LUIS. While all of these platforms have converged on the idea that the NLP classifiers can carry much of the load of understanding the question, they provide various levels of QA support for actually answering the question or completing the task.

In Dialogflow, knowledge connectors have been released as a complement to the traditional intent approach. If your agent doesn’t classify an incoming query as a predefined intent, you can configure it to fall back to the knowledge base for a response. This provides support for FAQ-style question answering. While this is similar to the unstructured QA support in MindMeld, it is slightly less flexible since it is mostly invoked as a fallback. Dialogflow doesn’t provide explicit support for structured QA on an app-specific knowledge base; rather, it leaves it up to the developer to define their own APIs as needed.

Similarly, Wit.ai and Amazon Lex leave it up to developers to set up their own knowledge bases and APIs as needed. Amazon Lex has the added benefit of integrating easily with other AWS services, such as AWS Lambda and the Elastic Stack. Elasticsearch provides all of the information retrieval functionality needed for advanced QA and is used by the MindMeld question answerer as well. However, the learning curve for Elasticsearch can be steep, and setting up your indices with optimized mappings, templates, analyzers, and filters can be quite cumbersome. These frameworks don’t provide easy-to-use APIs with defaults optimized for conversational interfaces.

RASA provides QA support for more structured documents via knowledge base actions. While developers can build their own systems for unstructured QA, retrieval of unstructured text is not explicitly supported.

Finally, Microsoft provides support for unstructured QA via their QnA Maker service. It can be used on its own for simple FAQ-style bots, or in conjunction with LUIS for more complex or broader use cases. An optimized service for more structured QA isn’t provided, but developers can set up their own knowledge bases and APIs.

Here’s a summary of how these platforms compare:
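
Platform          Structured QA                  Unstructured QA
Dialogflow        Developer-defined APIs         Knowledge connectors (fallback only)
Wit.ai            Developer-defined APIs         Developer-defined APIs
Amazon Lex        Developer-defined APIs         Developer-defined APIs
RASA              Knowledge base actions         Not explicitly supported
Microsoft LUIS    Developer-defined APIs         QnA Maker
MindMeld          Built-in question answerer     Built-in question answerer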

With MindMeld, it’s easy to build the kind of question answering systems that production applications typically require. It provides a developer-friendly way to build app-specific question answering for a wide variety of scenarios and data types.

To learn more about working with MindMeld’s knowledge base and question answerer, check out our user guide.

We welcome every active contribution to our platform. Check us out on GitHub, and send us any questions or suggestions at mindmeld@cisco.com.
