Towards an intelligent Question Answering System (Chatbot) with Memory Networks

Purnendu Mukherjee
Aug 23, 2017 · 3 min read

The reason Deep Learning has taken the world by storm and is the vanguard for the growth of Artificial Intelligence is not just its “unreasonable effectiveness”, but also its core philosophy. Deep Learning often draws parallels with human cognition and closely mirrors how we mortals think and learn — a primary inspiration for the growing army of researchers in this field. While the biggest waves were created by Convolutional Neural Networks and Recurrent Neural Networks, which do capture how we form our visual and sequential memories, their memory (encoded in hidden states and weights) was typically too small and not compartmentalized enough to accurately remember facts from the past (knowledge is compressed into dense vectors) [1].

Deep Learning needed a methodology that preserves memories as they are, so that they won’t be lost to generalization and recalling exact words or sequences of events remains possible — something computers are already good at! This effort led to Memory Networks, which is also the title of the paper published at ICLR 2015 by Facebook AI Research.

This paper provides a basic framework to store, augment and retrieve memories while working seamlessly with a Recurrent Neural Network architecture. While the authors didn’t explicitly draw a parallel with how we humans store memories in our brain, I’d like to do that. Whenever we learn something, we try to ground it in what we already know. Also, you might have noticed that certain words, when heard, or pictures, when seen, can evoke memories in us. This happens because our (human) memory network is constantly updated with new connections/synapses, and whatever is important to us is brought to our ‘attention’. Thus, new memories should be able to update previous relevant memories.

Now let’s look at the practical side of the architecture as described in the paper [1]. The memory network consists of a memory m (an array of objects indexed by mᵢ) and four (potentially learned) components I, G, O and R, as follows (a toy code sketch follows the list):

I: (input feature map) — converts the incoming input to the internal feature representation, either a sparse feature vector or a dense one, such as those from word2vec or GloVe.

G: (generalization) — updates old memories given the new input. The authors call this generalization because the network has an opportunity to compress and generalize its memories at this stage for some intended future use. This is the analogy I was drawing earlier.

O: (output feature map) — produces a new output (in the feature representation space), given the new input and the current memory state. This component is responsible for performing inference. In a question answering system, this part will select the candidate sentences (which might contain the answer) from the story (conversation) so far.

R: (response) — converts the output into the response format desired. For example, a textual response or an action. In the QA system described, this component finds the desired answer and then converts it from feature representation to the actual word.
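To make the four components concrete, here is a deliberately tiny Python sketch of the I/G/O/R pipeline. Everything in it is a simplification chosen for illustration: the bag-of-words featurizer, the word-overlap scoring and the last-word answer heuristic are my stand-ins, whereas the paper learns embedding-based scoring functions for the O and R stages.

```python
from collections import Counter

def bow(text):
    """I (input feature map): map raw text to a sparse bag-of-words vector.
    The paper allows sparse or dense features (e.g. word2vec/GloVe)."""
    return Counter(text.lower().rstrip(".?").split())

class ToyMemoryNetwork:
    """A toy illustration of the memory network framework, not the paper's
    learned model: scoring is plain word overlap instead of learned scorers."""

    def __init__(self):
        self.memory = []  # m: an array of (sentence, feature vector) slots

    def G(self, sentence):
        # G (generalization): the simplest variant from the paper,
        # writing the new memory into the next free slot.
        self.memory.append((sentence, bow(sentence)))

    def O(self, question):
        # O (output feature map): score every memory slot against the
        # question and select the best supporting sentence (inference).
        q = bow(question)
        return max(self.memory, key=lambda slot: sum((q & slot[1]).values()))

    def R(self, question):
        # R (response): map the selected memory back to an answer word.
        # Toy heuristic: answer with the final word of the supporting
        # sentence; the paper ranks candidate words with a learned scorer.
        sentence, _ = self.O(question)
        return sentence.rstrip(".").split()[-1]

net = ToyMemoryNetwork()
net.G("Mary moved to the bathroom.")
net.G("John went to the hallway.")
print(net.R("Where is Mary?"))  # -> bathroom
```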

This is a fully supervised model, meaning the candidate sentences from which the answer can be found are marked during the training phase; this setup can also be termed ‘hard attention’.
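To make “fully supervised” concrete: since the index of the true supporting sentence is given at training time, the O module can be trained to score it above every other memory slot. Below is a sketch of a margin ranking loss in that spirit; the function name, margin value and toy scores are my illustrative choices, not the paper’s exact formulation.

```python
import torch

def hard_attention_loss(scores, supporting_idx, margin=0.1):
    """Margin ranking loss in the spirit of the paper's supervised setup:
    the labelled supporting memory must outscore every other slot by at
    least `margin`. `scores` holds the O-module score for each slot."""
    pos = scores[supporting_idx]
    neg = torch.cat([scores[:supporting_idx], scores[supporting_idx + 1:]])
    return torch.clamp(margin - pos + neg, min=0).sum()

# Four memory slots; slot 2 is the labelled supporting sentence.
scores = torch.tensor([0.3, 0.1, 0.2, 0.4], requires_grad=True)
loss = hard_attention_loss(scores, supporting_idx=2)
loss.backward()  # gradients push the true fact's score above the others
```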

The authors tested the QA system on various texts, including Lord of the Rings:

Image taken from the Memory Networks paper [1]

Another example comes from the bAbI dataset, a question answering dataset containing 20 different logical reasoning tasks.
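A lightly reformatted snippet in the style of bAbI task 1 (“single supporting fact”) illustrates the format: numbered story lines, then a question followed by the answer and the index of the supporting fact, which is exactly the supervision signal the fully supervised model relies on.

```
1 Mary moved to the bathroom.
2 John went to the hallway.
3 Where is Mary?    bathroom    1
```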

Although the first wave of research publications on Question Answering systems with Deep Learning was brought about by Memory Networks, current research is ruled by attention-based methods. The next part, in which I will talk about attention-based Question Answering systems, is coming soon. :)
