Recurrent / LSTM layers explained in a simple way
Part of a series about the different types of layers in neural networks
This post is meant to be read after:
- Neural network explained in a simple way
- Linear layers explained in a simple way
- Dense layers explained in a simple way
Introduction
For all the layers introduced so far, repeating the same input always produces the same output. For instance, take a linear layer that computes F(x) = 2x. Each time we ask it to predict F(3), we get 6. So if we ask 10 times in a row for the output when the input is 3, the NN will give 6 every time:
F(3) = 6; F(3) = 6; F(3) = 6; F(3) = 6; F(3) = 6; …
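To make this concrete, here is a minimal Python sketch of such a stateless layer. The names `linear_layer` and `weight` are only illustrative and do not come from any particular library; the point is simply that nothing is remembered between calls.

```python
def linear_layer(x, weight=2.0):
    """A stateless linear layer: the output depends only on the current input."""
    return weight * x


# Ask for the prediction 10 times in a row with the same input.
outputs = [linear_layer(3) for _ in range(10)]

# Every call returns the same value, because no state is kept between calls.
print(outputs)
```

However many times we call it, the function has no way of knowing that it has already seen the value 3.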
Now imagine we are training an algorithm to detect repetitions: we want F(3) = 0 the first time (no repetition detected), and F(3) = 1 the second time. We can’t achieve this behavior with non-recurrent layers, since by definition they always give the same output for the same input. A hacky workaround is to take a vector of 2 variables as input, so that the first variable can be treated differently from the second. Then F([3;0]) = 0 (no repetition detected) but F([3;3]) = 1 (repetition detected). The…