xLSTM vs. Transformers: Who Will Win?

Vishal Rajput · AIGuys · May 22, 2024

Transformers have dominated the conversation for quite some time, but before their rapid rise, LSTMs were the kings. LSTM, or Long Short-Term Memory, was invented to solve the vanishing gradient problem of Recurrent Neural Networks. Recently there was a lot of hype around Mamba, a state space model; LSTMs can be thought of as a precursor to state space models. But today, we are discussing a newer version of the LSTM called xLSTM, an architecture that can not only compete with Transformers but in some cases even outclass them.
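To see why the vanishing gradient problem matters, recall that backpropagation through time multiplies the gradient by the recurrent weight (times the activation derivative) at every time step, so it shrinks or explodes exponentially with sequence length. Here is a minimal sketch of that effect; the weight value and step count are arbitrary illustrative choices, not numbers from any paper:

```python
import numpy as np

# Backpropagating through T steps of a scalar RNN multiplies the
# gradient by the recurrent weight at every step. With |w| < 1 the
# gradient vanishes; with |w| > 1 it explodes.
w = 0.9          # recurrent weight (illustrative)
grad = 1.0       # gradient flowing in from the loss
for t in range(100):
    grad *= w    # one step of backprop through time
print(f"gradient after 100 steps: {grad:.2e}")  # ~2.66e-05, effectively zero
```

LSTM's gating mechanism was designed precisely to give gradients a path that avoids this repeated multiplicative shrinkage.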

If you want to read more about Mamba, check out my earlier article on it.

So, without further ado, let’s jump right into this exciting new research.

Table of Contents

  • Understanding RNN
  • An Intro to LSTM
  • What Does xLSTM Offer?
  • Decoding sLSTM and mLSTM Block
  • xLSTM Architecture
  • Conclusion

Understanding RNN

Recurrent Neural Networks are a very special kind of network: constrained in many ways, yet offering some remarkable properties. A glaring limitation of Vanilla Neural Networks (and also Convolutional Networks) is that their structure is too rigid: they accept a fixed-size vector as input and produce a fixed-size vector as output, whereas RNNs let us operate over sequences of vectors.
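As a concrete contrast, here is a minimal sketch (using PyTorch; the layer sizes and sequence lengths are arbitrary) showing how an RNN carries a hidden state across time steps, so the same weights handle sequences of any length:

```python
import torch
import torch.nn as nn

# A vanilla feed-forward layer maps one fixed-size vector to another.
mlp = nn.Linear(8, 4)

# An RNN instead carries a hidden state across an arbitrary number
# of time steps, so the same parameters process any sequence length.
rnn = nn.RNN(input_size=8, hidden_size=4, batch_first=True)

short_seq = torch.randn(1, 5, 8)    # batch of 1, 5 time steps, 8 features
long_seq = torch.randn(1, 50, 8)    # same model, 50 time steps

for seq in (short_seq, long_seq):
    outputs, h_n = rnn(seq)          # outputs: one hidden vector per step
    print(outputs.shape, h_n.shape)  # (1, T, 4) and (1, 1, 4)
```

This sequence-in, sequence-out flexibility is exactly what fixed-size feed-forward architectures lack.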
