Recipe: Practical Transformers
Creators: Suhas Pai, Nabila Abraham
Objective: Enable users to fine-tune transformers for their own NLP tasks
Audience Level: Intermediate
Main Concept: Transformer-based architecture is the main concept you will learn through the following resources
Background Concept: You need to know the attention mechanism in order to learn the main concept
Subsequent Concept: Once you know the main concept, you can learn Longformer, DistilBERT, and Transformer-XL
Resources
You should go through the following resources in the order provided:
1. The Illustrated Transformer
Type: Blog post; Theory, Main Concept
Estimated time commitment: 60 mins
Why is this a good resource: This blog post introduces an intuitive way to visualize transformers and the Query, Key and Value matrices
How to use this resource: Read the entire blog post and pay attention to the visuals
Instructor: Jay Alammar
Link: http://jalammar.github.io/illustrated-transformer/
2. [Transformer] Attention is All You Need
Type: Video; Theory, Main Concept
Estimated time commitment: 60 mins
Why is this a good resource: This video is a group discussion of transformer networks that should help resolve the most common points of confusion about the topic
How to use this resource: Watch the entire video
Speaker: Joe Palermo
Date Created: Oct 2018
Link: https://ai.science/e/transformer-networks-attention-is-all-you-need--2018-10-22
3. PyTorch Transformers Tutorials
Type: GitHub Repo; Implementation, Main Concept
Estimated time commitment: 90 mins
Why is this a good resource: This is an easy-to-understand code tutorial implementing Transformers
How to use this resource: Run the notebooks
Language: Python
Creator: Abhi Mishra
Computational Resource Need: Medium
Repo Last Updated: 2020
Link: https://github.com/abhimishra91/transformers-tutorials
4. BERT: State of the Art NLP Model, Explained
Type: Blog post; Application, Subsequent Concept
Estimated time commitment: 20 mins
Why is this a good resource: This blog post outlines some of the most important use cases that leverage BERT. It also provides useful estimates of the compute needed to run BERT in practice
How to use this resource: Read the post for different use cases of BERT
Author: Rani Horev
Link: https://www.kdnuggets.com/2018/12/bert-sota-nlp-model-explained.html