Recurrent / LSTM layers explained in a simple way

Part of a series about different types of layers in neural networks

Assaad MOAWAD
DataThings


This post is meant to be read after:

Introduction

All of the previously introduced layers are stateless: they generate the same output every time we repeat the same input. For instance, take a linear layer with f(x) = 2x. Each time we ask it to predict f(3), we get 6. So if we ask ten times in a row for the output when the input is 3, the network will always answer 6:

f(3)=6; f(3)=6; f(3)=6; f(3)=6; f(3)=6; …
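As a minimal sketch of this point (in Python; the function name is illustrative, not from the post), a stateless layer behaves like a pure function, so repeated calls with the same input always return the same value:

```python
def linear_layer(x):
    # A stateless layer: the output depends only on the current input.
    return 2 * x

# Calling it ten times in a row with the same input always yields 6.
print([linear_layer(3) for _ in range(10)])  # [6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
```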

Now imagine we are training an algorithm to detect repetitions: we want f(3) = 0 the first time (no repetition detected), and then f(3) = 1 the second time. We can't achieve this behavior with non-recurrent layers, since by definition they always produce the same output for the same input. A hack solution is to take a vector of 2 variables, so that the first variable can be treated differently from the second: f([3; 0]) = 0 (no repetition detected) but f([3; 3]) = 1 (repetition detected). The…
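To make the contrast concrete, here is a toy sketch (hypothetical Python, not from the post) of what statefulness buys us: a cell that remembers its previous input between calls, so the same input can produce different outputs over time, exactly the behavior a stateless layer cannot express:

```python
class RepetitionDetector:
    """Toy stateful 'layer': remembers the previous input between calls."""

    def __init__(self):
        self.previous = None  # internal state carried across calls

    def __call__(self, x):
        # Output 1 if the current input repeats the previous one, else 0.
        repeated = 1 if x == self.previous else 0
        self.previous = x  # update the state for the next call
        return repeated

f = RepetitionDetector()
print(f(3))  # 0 -> first time we see 3, no repetition
print(f(3))  # 1 -> same input again, repetition detected
```

Recurrent layers generalize this idea: instead of a hand-written memory rule, the cell learns what to keep in its hidden state and how that state should shape the next output.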
