TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Photo by Mor Shani on Unsplash


The Intuition Behind Transformers — Attention is All You Need

8 min read · Nov 1, 2020


Traditionally, recurrent neural networks and their variants have been used extensively for natural language processing problems. In recent years, transformers have outperformed most RNN models. Before looking at transformers, let's revisit recurrent neural networks, how they work, and where they fall short.

Recurrent neural networks (RNNs) work with sequential data, such as the text in language translation or time-series measurements. There are several types of recurrent neural networks.

Source: Wikipedia
  • Vector-to-Sequence Models: these take in a single vector and return a sequence of any length.
  • Sequence-to-Vector Models: these take in a sequence as input and return a vector as output. They are commonly used in sentiment analysis problems (see the sketch after this list).
  • Sequence-to-Sequence Models: as you may have guessed by now, these take a sequence as input and output another sequence. They are commonly seen in language translation applications.
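To make the sequence-to-vector case concrete, here is a minimal PyTorch sketch of a sentiment classifier. The class name, vocabulary size, and dimensions are illustrative choices, not taken from the original article.

```python
import torch
import torch.nn as nn

class SentimentRNN(nn.Module):
    """Sequence-to-vector RNN: reads a token sequence, emits one score."""

    # Vocabulary and dimensions are made up for illustration.
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # single sentiment logit

    def forward(self, token_ids):  # token_ids: (batch, seq_len)
        embedded = self.embed(token_ids)           # (batch, seq_len, embed_dim)
        _, last_hidden = self.rnn(embedded)        # (1, batch, hidden_dim)
        # The whole sequence is summarized by the final hidden state:
        # that single vector is the "vector" in sequence-to-vector.
        return self.head(last_hidden.squeeze(0))   # (batch, 1)

# Usage: a batch of 4 "sentences", each 12 tokens long.
logits = SentimentRNN()(torch.randint(0, 10_000, (4, 12)))
print(logits.shape)  # torch.Size([4, 1])
```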

Natural Language Processing and RNNs

When it comes to natural language processing, RNNs typically work in an encoder-decoder setup: an encoder reads the input sequence and compresses it into a fixed-size context vector, and a decoder generates the output sequence from that vector.
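Below is a minimal sketch of that setup, assuming a GRU-based encoder-decoder for translation; the vocabulary sizes and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Toy GRU encoder-decoder: the encoder compresses the source
    sequence into a single context vector; the decoder generates the
    target sequence conditioned only on that vector."""

    # Vocabulary sizes and hidden dimension are made up for illustration.
    def __init__(self, src_vocab=8_000, tgt_vocab=8_000, dim=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: everything the decoder sees is squeezed into `context`.
        _, context = self.encoder(self.src_embed(src_ids))
        # Decode: unroll the target sequence starting from that one vector.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), context)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab)

# Usage: a batch of 2 source sentences (10 tokens) and 7-token targets.
model = EncoderDecoder()
logits = model(torch.randint(0, 8_000, (2, 10)),
               torch.randint(0, 8_000, (2, 7)))
print(logits.shape)  # torch.Size([2, 7, 8000])
```

Notice that the decoder sees the source only through the single `context` vector. That fixed-size bottleneck is precisely the limitation that attention, and ultimately transformers, were introduced to relieve.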




Written by Sam Palani

Machine Learning & AI Specialist @ AWS. ❤ = Travel, Books & Jazz. {samx18 @ most places online} Views are my own.
