xLSTM vs. Transformers: Who Will Win?

Vishal Rajput · AIGuys · May 22, 2024

Transformers have dominated the conversation for quite some time, but before their rapid rise, LSTMs were the kings. LSTM, or Long Short-Term Memory, was invented to solve the vanishing gradient problem of Recurrent Neural Networks. Recently there was a lot of hype around Mamba, a state space model; LSTMs can be thought of as a precursor to state space models. But today, we are discussing a newer version of the LSTM called xLSTM, an architecture that can not only compete with Transformers but in some cases even outclass them.
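To see why the vanishing gradient problem matters, recall that backpropagation through time multiplies the gradient by the recurrent weight (times the activation derivative) at every time step, so it shrinks or explodes exponentially with sequence length. Here is a minimal sketch of that effect; the weight value and step count are arbitrary illustrative choices, not numbers from any paper:

```python
import numpy as np

# Backpropagating through T steps of a scalar RNN multiplies the
# gradient by the recurrent weight at every step. With |w| < 1 the
# gradient vanishes; with |w| > 1 it explodes.
w = 0.9          # recurrent weight (illustrative)
grad = 1.0       # gradient flowing in from the loss
for t in range(100):
    grad *= w    # one step of backprop through time
print(f"gradient after 100 steps: {grad:.2e}")  # ~2.66e-05, effectively zero
```

LSTM's gating mechanism was designed precisely to give gradients a path that avoids this repeated multiplicative shrinkage.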

If you want to read more about Mamba, check out my earlier article on it.

So, without further ado, let’s jump right into this exciting new research.

Table of Contents

  • Understanding RNN
  • An Intro to LSTM
  • What Does xLSTM Offer?
  • Decoding sLSTM and mLSTM Block
  • xLSTM Architecture
  • Conclusion

Understanding RNN

Recurrent Neural Networks are a very special kind of network: constrained in many ways, yet offering some remarkable properties. A glaring limitation of Vanilla Neural Networks (and also Convolutional Networks) is that their structure is too rigid: they accept a fixed-size vector as input and produce a fixed-size vector as output, whereas RNNs let us operate over sequences of vectors.
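As a concrete contrast, here is a minimal sketch (using PyTorch; the layer sizes and sequence lengths are arbitrary) showing how an RNN carries a hidden state across time steps, so the same weights handle sequences of any length:

```python
import torch
import torch.nn as nn

# A vanilla feed-forward layer maps one fixed-size vector to another.
mlp = nn.Linear(8, 4)

# An RNN instead carries a hidden state across an arbitrary number
# of time steps, so the same parameters process any sequence length.
rnn = nn.RNN(input_size=8, hidden_size=4, batch_first=True)

short_seq = torch.randn(1, 5, 8)    # batch of 1, 5 time steps, 8 features
long_seq = torch.randn(1, 50, 8)    # same model, 50 time steps

for seq in (short_seq, long_seq):
    outputs, h_n = rnn(seq)          # outputs: one hidden vector per step
    print(outputs.shape, h_n.shape)  # (1, T, 4) and (1, 1, 4)
```

This sequence-in, sequence-out flexibility is exactly what fixed-size feed-forward architectures lack.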
