The Intuition Behind Transformers — Attention is All You Need
Traditionally, recurrent neural networks (RNNs) and their variants have been used extensively for Natural Language Processing problems. In recent years, transformers have outperformed most RNN models. Before looking at transformers, let's revisit recurrent neural networks: how they work, and where they fall behind.
Recurrent Neural Networks (RNNs) operate on sequential data, such as the word sequences in language translation or time-series data. There are several types of recurrent neural networks:
- Vector to Sequence Models: take in a single vector and return a sequence of any length.
- Sequence to Vector Models: take in a sequence as input and return a vector as output. These models are commonly used in sentiment analysis problems, where a sentence is condensed into a single score (see the sketch after this list).
- Sequence to Sequence Models: as you may have guessed, take a sequence as input and output another sequence. They are commonly seen in language translation applications.
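To make the sequence-to-vector case concrete, here is a minimal sketch of a sentiment classifier using PyTorch's `nn.RNN`. The class name, vocabulary size, and dimensions are illustrative assumptions, not details from the original article: the point is simply that the final hidden state compresses the whole input sequence into one vector.

```python
import torch
import torch.nn as nn

class SentimentRNN(nn.Module):
    """Sequence-to-vector model: reads a token sequence, outputs one sentiment logit."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # single sentiment score

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> embeddings: (batch, seq_len, embed_dim)
        x = self.embed(token_ids)
        # h_n is the final hidden state: the whole sequence compressed into one vector
        _, h_n = self.rnn(x)
        return self.head(h_n.squeeze(0))  # (batch, 1)

model = SentimentRNN(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 12)))  # batch of 2 sequences, 12 tokens each
print(logits.shape)  # torch.Size([2, 1])
```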
Natural Language Processing and RNNs
When it comes to natural language processing, RNNs work in an encoder-decoder…
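As a rough sketch of that encoder-decoder pattern: the encoder RNN below compresses the source sentence into a single context vector, which then initializes the decoder RNN that generates the target sequence. All names and sizes are illustrative assumptions, not from the original.

```python
import torch
import torch.nn as nn

class EncoderDecoderRNN(nn.Module):
    """Illustrative encoder-decoder: the encoder compresses the source sequence
    into a context vector; the decoder generates the target sequence from it."""
    def __init__(self, src_vocab, tgt_vocab, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
        self.encoder = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: the final hidden state summarizes the entire source sentence.
        _, context = self.encoder(self.src_embed(src_ids))
        # Decode: the context vector initializes the decoder's hidden state.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), context)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

model = EncoderDecoderRNN(src_vocab=8_000, tgt_vocab=8_000)
logits = model(torch.randint(0, 8_000, (2, 10)), torch.randint(0, 8_000, (2, 9)))
print(logits.shape)  # torch.Size([2, 9, 8000])
```

Notice that the entire source sentence must squeeze through one fixed-size context vector. That bottleneck is a key place where RNNs fall behind, and it is exactly what the attention mechanism in transformers was designed to address.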