Long Short-Term Memory (1997) | one minute summary

Do you remember the key points of Long Short-Term Memory?

Jeffrey Boschman
One Minute Machine Learning
1 min read · Apr 25, 2021
The seminal paper by Hochreiter and Schmidhuber (1997) has greatly influenced recurrent neural networks (RNNs).

  1. Standard RNNs struggle to retain information that only becomes useful much later in a sequence (e.g., given “I grew up in France… <add a few sentences> …I speak ___”, they would not know to infer that the output is likely “French”)
  2. Standard RNNs compute their output from the current input and the hidden state passed from the previous time step → long short-term memory (LSTM)-based models introduce a second flow of information: the cell state, or “constant error carousel”, an information highway between time steps that can store context (controlled by four interacting layers)
  3. Three of these new layers (the forget gate, input gate, and candidate layer) update the cell state based on the input and hidden-state information, while the fourth (the output gate) combines them with the cell state to produce the output/hidden state passed to the next time step
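The four interacting layers described above can be sketched as a single LSTM time step. This is a minimal NumPy illustration, not the paper's original formulation; the function name, weight layout (all four layers stacked into one matrix), and toy dimensions are assumptions for the sake of the example:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One hypothetical LSTM time step.

    W: (4*H, H+D) weights for the four layers, b: (4*H,) biases,
    where H = hidden size and D = input size.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:H])        # forget gate: what to erase from the cell state
    i = sigmoid(z[H:2*H])      # input gate: how much new info to admit
    g = np.tanh(z[2*H:3*H])    # candidate layer: new values to store
    o = sigmoid(z[3*H:4*H])    # output gate: what part of the cell to expose
    c = f * c_prev + i * g     # cell state update (the "information highway")
    h = o * np.tanh(c)         # output / hidden state for the next time step
    return h, c

# Toy usage with random weights (hypothetical sizes D=3, H=4)
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.standard_normal((4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, b)
```

Note how the cell state `c` is updated only by elementwise scaling and addition, which is what lets context flow across many time steps.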

See this wonderful blog post for a more in-depth explanation and diagrams.


An endlessly curious grad student trying to build and share knowledge.