Day 67 of 100DaysofML

Charan Soneji
Published in 100DaysofMLcode
Sep 3, 2020

LSTM. Since I used an LSTM model in my last blog, I thought I would cover the basics of LSTM models, which are very commonly used. LSTMs are a type of RNN and are used for tasks such as sequence prediction. So what is different about an LSTM compared to a plain RNN? LSTMs address two weaknesses commonly observed in RNNs: short-term memory and the vanishing gradient problem.

Sequence prediction problems are very common. One of the most familiar applications is the word suggestion on the keyboard you use on your phone, be it Android or Apple.

Concept

When we arrange our calendar for the day, we prioritize our appointments, right? If we need to make room for something important, we know which meeting could be canceled to accommodate it.

It turns out that an RNN doesn't do this. To add new information, it transforms the existing information completely by applying a function. As a result, the information is modified as a whole, i.e. there is no distinction between 'important' and 'not so important' information.
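To make this concrete, here is a minimal sketch of one step of a plain RNN in pure Python. The function and parameter names (`rnn_step`, `W_x`, `W_h`, `b`) are my own for illustration and do not come from any library. Notice that the whole previous state `h_prev` goes through the same function on every step, so nothing is singled out to be kept as-is.

```python
import math


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(x, h_prev, W_x, W_h, b):
    """One step of a plain RNN: h_new = tanh(W_x x + W_h h_prev + b).

    Every component of h_prev is mixed with the input and squashed,
    so the old state is overwritten as a whole.
    """
    from_input = matvec(W_x, x)
    from_state = matvec(W_h, h_prev)
    return [math.tanh(a + s + bias) for a, s, bias in zip(from_input, from_state, b)]


# Toy example: one input feature, one hidden unit (hypothetical weights).
h = rnn_step(x=[1.0], h_prev=[1.0], W_x=[[1.0]], W_h=[[1.0]], b=[0.0])
print(h)
```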

LSTMs, on the other hand, make small modifications to the information through multiplications and additions. With LSTMs, the information flows through a mechanism known as the cell state. This way, LSTMs can selectively remember or forget things. The information at a particular cell state has three dependencies: the previous cell state, the previous hidden state, and the input at the current time step.
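The "multiplications and additions" can be sketched in a few lines. This is only an illustration of the commonly used cell state update, with hand-picked gate values instead of learned ones; the names `forget`, `write` and `candidate` are my own. A forget value near 1 keeps the old entry, a value near 0 erases it, and the write value decides how much of the new candidate gets added.

```python
def update_cell_state(c_prev, forget, write, candidate):
    """Element-wise update: c_new = forget * c_prev + write * candidate.

    `forget` and `write` are gate values between 0 and 1.
    """
    return [
        f * c + w * g
        for f, c, w, g in zip(forget, c_prev, write, candidate)
    ]


c_prev = [2.0, 3.0]
forget = [1.0, 0.0]      # keep the first entry, erase the second
write = [0.0, 1.0]       # write nothing to the first, everything to the second
candidate = [0.5, -0.5]  # new information proposed for this step

print(update_cell_state(c_prev, forget, write, candidate))  # [2.0, -0.5]
```

The first entry survives untouched while the second is replaced, which is the selective remembering and forgetting that a plain RNN cannot do.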

Architecture

Both RNNs and LSTMs pass information forward as they process a sequence. However, unlike RNNs, LSTMs make use of gates to decide whether they should keep or forget information.

An LSTM layer consists of a set of recurrently connected blocks, known as memory blocks. These blocks can be thought of as a differentiable version of the memory chips in a digital computer. Each one contains one or more recurrently connected memory cells and three multiplicative units — the input, output and forget gates — that provide continuous analogues of write, read and reset operations for the cells. … The net can only interact with the cells via the gates.
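Putting the three gates together, one step of an LSTM cell can be sketched as below. This follows the commonly used formulation without peephole connections, written in pure Python so it runs without any libraries. The parameter names (`W_f`, `b_f` and so on) and the dictionary layout are assumptions I made for the example, not the API of any framework.

```python
import math


def sigmoid(z):
    """Squash a number into (0, 1); used for the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step. Returns the new hidden state and the new cell state."""
    z = x + h_prev  # list concatenation: current input, then previous hidden state

    f = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]    # forget gate (reset)
    i = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]    # input gate (write)
    o = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]    # output gate (read)
    g = [math.tanh(a) for a in affine(p["W_g"], z, p["b_g"])]  # candidate values

    # The cell state is only touched through the forget and input gates.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # The output gate controls how much of the cell is exposed as hidden state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# Toy setup: 1 input feature, 2 hidden units, all weights zero (hypothetical values).
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {
    "W_f": zeros, "b_f": [20.0, 20.0],    # forget gate almost fully open: keep memory
    "W_i": zeros, "b_i": [-20.0, -20.0],  # input gate almost closed: write nothing
    "W_o": zeros, "b_o": [0.0, 0.0],
    "W_g": zeros, "b_g": [0.0, 0.0],
}
h, c = lstm_step([1.0], [0.1, -0.1], [1.5, -0.7], params)
print(c)  # stays very close to the previous cell state
```

With the forget gate open and the input gate closed, the cell state passes through the step almost unchanged, whatever the input is. In a trained network the gate values are learned from data rather than fixed by hand like this.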

I shall work through a simple code example in my upcoming blog, but I shall wrap up the basics with the video given below.

Thanks for reading. Keep Learning.

Cheers.
