One step forward, one step back: a trip down memory lane! Not so long ago, recurrent neural networks were the go-to architecture for just about anything sequential in nature, most notably text data. RNN variants such as the GRU and LSTM were used for text classification, paraphrasing, language modeling, token classification, and other non-standard problems. However, LSTM models were starting…