Understanding Transformers, the Programming Way

Because you can only understand it if you can program it

Transformers have become the de facto standard for NLP tasks. They were first used in NLP, but they are now also applied in computer vision and sometimes to generate music. I am sure you have all heard about the GPT-3 Transformer, or the jokes about it.

But all that aside, they are still as hard to understand as ever. In my last post, I talked in quite some detail about transformers and how they work at a basic level. I went through the encoder and decoder architecture and the whole data flow through those different pieces of the neural network.

But as I like to say, we don’t really understand something until we implement it ourselves. So in this post, we will implement an English-to-German language translator using Transformers.

Task Description

We want to create a translator that uses transformers to convert English to German. So, if we look at it as a black box, our network takes an English sentence as input and returns a German sentence.
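
To make the black-box view concrete, here is a minimal sketch of the contract we are aiming for: a string goes in, a string comes out. The names `Translator`, `LookupTranslator`, and `run` are my own placeholders for illustration, and the lookup table is only a toy stand-in, not a transformer.

```python
from typing import Protocol


class Translator(Protocol):
    """Black-box contract: English sentence in, German sentence out."""

    def translate(self, english: str) -> str: ...


class LookupTranslator:
    """Hypothetical stand-in, NOT a transformer: a hard-coded phrase table."""

    def __init__(self, phrase_table: dict[str, str]) -> None:
        self.phrase_table = phrase_table

    def translate(self, english: str) -> str:
        # Normalise case and surrounding whitespace before the lookup.
        key = english.strip().lower()
        # Sentences missing from the table are returned unchanged.
        return self.phrase_table.get(key, english)


def run(model: Translator, sentence: str) -> str:
    # The caller depends only on the interface, not on what is inside the box.
    return model.translate(sentence)


toy = LookupTranslator({"good morning": "Guten Morgen"})
print(run(toy, "Good morning"))
```

Anything that satisfies the same `translate` signature can be dropped in later, so the rest of the code would not need to change when the toy table is swapped for a trained network.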

Published in Towards Data Science

Rahul Agarwal