LSTMs address this problem by tying the forget gate activations into the gradient computation: the forget gate creates a path along which information the LSTM should not forget can flow, and gradients can pass back along that same path without vanishing.
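In symbols (a sketch using common LSTM notation, with $f_t$, $i_t$, $c_t$, and $\tilde{c}_t$ denoting the forget gate, input gate, cell state, and candidate values; these symbols are a conventional choice rather than notation defined in this article):

$$
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,
\qquad
\frac{\partial c_t}{\partial c_{t-1}} = \operatorname{diag}(f_t).
$$

Wherever the forget gate stays close to 1, gradients flowing back through the cell state pass through almost unchanged instead of shrinking at every step.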
The LSTM has three gates that update and control the cell state: the forget gate, the input gate, and the output gate. The gates themselves use sigmoid activations, while the hyperbolic tangent is applied to the candidate values and to the cell state when producing the hidden state.
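The following is a minimal sketch of one LSTM step in NumPy to make the gate mechanics concrete. The weight names (W_f, W_i, W_o, W_c), shapes, and the lstm_step helper are illustrative assumptions, not taken from any particular library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One forward step of an LSTM cell (illustrative sketch)."""
    z = np.concatenate([h_prev, x])                 # stack [h_{t-1}, x_t]

    f = sigmoid(params["W_f"] @ z + params["b_f"])  # forget gate (sigmoid)
    i = sigmoid(params["W_i"] @ z + params["b_i"])  # input gate  (sigmoid)
    o = sigmoid(params["W_o"] @ z + params["b_o"])  # output gate (sigmoid)
    g = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate values (tanh)

    # Additive cell-state update: gradients flow back through c wherever
    # the forget gate f stays close to 1.
    c = f * c_prev + i * g
    h = o * np.tanh(c)                              # new hidden state
    return h, c

# Tiny usage example with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
H, X = 4, 3
params = {name: rng.standard_normal((H, H + X)) * 0.1
          for name in ("W_f", "W_i", "W_o", "W_c")}
params.update({f"b_{k}": np.zeros(H) for k in ("f", "i", "o", "c")})
h, c = lstm_step(rng.standard_normal(X), np.zeros(H), np.zeros(H), params)
```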
However, plain RNNs suffer from the problem of vanishing gradients, which hampers learning over long data sequences. The …