Memory is a fascinating function of the human brain. In particular, short-term memory and long-term memory work in conjunction to help us decide how to respond to the current stimulus, and that interplay is what makes us function in the real world.
Let’s take an example.
I am watching a program on TV and suddenly a picture of a dog/wolf comes up. From the looks of it, I cannot distinguish between the two. What is it — a dog or a wolf?
If the previous image was a squirrel, and squirrels are likely to be found in a domestic setting, I could assess that the current image is a dog.
That would be a reasonable guess if all I had was my short-term memory.
This is where my long-term memory kicks in and tells me that I am watching a show about wild animals. Voila, the obvious answer now is that the current image is of a wolf.
LSTMs mimic human memory
A specific branch of Deep Learning called LSTMs (Long Short-Term Memory networks) is used to solve problems that have temporal (or time-based) dependencies. In other words, LSTMs mimic human memory to predict outcomes. Unlike vanilla RNNs (recurrent neural networks), which keep only a short-term memory around, LSTMs also carry a long-term memory, which gives their predictions higher fidelity.
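To make that distinction concrete, here is a minimal pure-Python sketch (scalar states and made-up fixed weights, not a real trained network): a vanilla RNN cell passes a single state between time steps, while an LSTM cell passes two.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rnn_step(x_t, h_prev):
    # A vanilla RNN cell carries ONE state between time steps:
    # the hidden state h, i.e. short-term memory only.
    return math.tanh(0.5 * x_t + 0.5 * h_prev)  # toy fixed weights

def lstm_step(x_t, h_prev, c_prev):
    # An LSTM cell carries TWO states between time steps: the hidden
    # state h (short-term memory) and the cell state c (long-term
    # memory). This update is deliberately simplified; a real LSTM
    # uses learned gates.
    c_t = 0.9 * c_prev + 0.1 * math.tanh(x_t)  # slowly changing memory
    h_t = math.tanh(c_t) * sigmoid(h_prev)     # fast, per-step memory
    return h_t, c_t

# Run both cells over the same short sequence:
h_rnn = 0.0
h, c = 0.0, 0.0
for x in [1.0, -0.5, 2.0]:
    h_rnn = rnn_step(x, h_rnn)  # only h survives each step
    h, c = lstm_step(x, h, c)   # both h and c survive each step
```

The point is purely structural: the RNN's entire memory of the past is the single number `h_rnn`, while the LSTM threads an extra slot `c` through time in which information can persist for many steps.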
The Working of LSTMs
The network takes in the long-term memory (Elephant), the short-term memory (Squirrel, Trees) and the current input (Dog/Wolf), and makes the following set of determinations:
- What should it forget? The trees, in this case, because the show is about wild animals and not trees.
- What should it learn? There is a Dog/Wolf in addition to the squirrels and trees.
- What should it predict? The Wolf
- What should it remember long term? Elephant, Squirrel, Wolf
- What should it remember short term? Squirrel, Wolf
All of the above happens at the current time step, and the new long- and short-term memories are fed in along with the next input at time t+1.
Thus, you can think of this whole process as recurring at every time step t.
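The five questions above map onto the standard LSTM gates: a forget gate decides what to drop from long-term memory, an input gate (with a candidate value) decides what to learn, the cell state c holds the long-term memory, and an output gate produces the hidden state h, which serves as both the prediction and the short-term memory for the next step. Below is a minimal pure-Python sketch of one time step with scalar toy weights (the weight values are made up for illustration; in a real LSTM they are learned):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, w):
    # w is a dict of scalar toy weights (illustrative, not learned).
    # Forget gate: what should be dropped from long-term memory?
    f = sigmoid(w["fx"] * x_t + w["fh"] * h_prev)
    # Input gate + candidate value: what new information to learn?
    i = sigmoid(w["ix"] * x_t + w["ih"] * h_prev)
    g = math.tanh(w["gx"] * x_t + w["gh"] * h_prev)
    # New long-term memory: keep part of the old, add part of the new.
    c_t = f * c_prev + i * g
    # Output gate: which part of memory becomes the prediction and
    # the short-term memory for the next step?
    o = sigmoid(w["ox"] * x_t + w["oh"] * h_prev)
    h_t = o * math.tanh(c_t)
    return h_t, c_t

# One time step with toy numbers:
w = {k: 0.5 for k in ["fx", "fh", "ix", "ih", "gx", "gh", "ox", "oh"]}
h, c = lstm_cell_step(x_t=1.0, h_prev=0.0, c_prev=0.0, w=w)
```

Feeding the returned `h` and `c` back into `lstm_cell_step` along with the next input is exactly the recurrence over time steps described above.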
In the next blog, we will dive deep into the LSTM network and see how each of the bulleted questions is answered.
Pretty interesting, isn’t it?
(Disclaimer: the example used is from the Deep Learning coursework on Udacity.)