Do-BERT

(Image source: Pixabay)

BERT (Bidirectional Encoder Representations from Transformers) has taken the world of NLP (Natural Language Processing) by storm.

Language text is essentially a sequence of words, so traditional methods like RNNs (Recurrent Neural Networks) and LSTMs (Long Short-Term Memory networks) used to be ubiquitous in language modeling (predicting the next word — remember typing an SMS?). But they struggled to remember words that appeared far back in the sequence. Then came ‘Attention Is All You Need’ and its architecture, the Transformer.
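The core idea that lets the Transformer handle long-range context is attention. As a minimal sketch (not from the original post, and using PyTorch purely for illustration), scaled dot-product attention lets every word look at every other word directly, instead of passing information step by step as an RNN does:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """Each output position is a weighted sum over ALL positions,
    so distant words are as reachable as neighboring ones."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5  # pairwise relevance
    weights = F.softmax(scores, dim=-1)                   # normalize to attention weights
    return weights @ value                                # blend the value vectors

# Toy usage: one "sentence" of 5 token vectors, each of size 8 (self-attention)
x = torch.randn(1, 5, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([1, 5, 8])
```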

BERT is a Transformer-based machine learning technique for NLP pre-training, developed in 2018 by Jacob Devlin and his colleagues at Google.
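Because BERT is pre-trained, you can load it and get contextual word representations without training anything yourself. Below is a hedged sketch that assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint, neither of which is mentioned in the original post:

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load the pretrained tokenizer and encoder (weights download on first run)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Encode a sentence; BERT attends to both the left and right context of every token
inputs = tokenizer("BERT reads text bidirectionally.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextual vector per token (including [CLS] and [SEP])
print(outputs.last_hidden_state.shape)
```

These token vectors are what downstream tasks (classification, question answering, etc.) are fine-tuned on top of.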

The following sketchnote gives an overview of BERT:

References

  • “Transformer: A Novel Neural Network Architecture for Language Understanding” — Google AI Blog (link)
  • “A Visual Guide to Using BERT for the First Time” — Jay Alammar (link)
  • “The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)” — Jay Alammar (link)
  • “The Illustrated Transformer” — Jay Alammar (link)
  • “Explaining BERT Simply Using Sketches” — Rahul Agarwal (link)
  • “Attention Is All You Need” — Ashish Vaswani et al. (link)

Originally published at LinkedIn


Yogesh Haribhau Kulkarni (PhD)
Google Developer Experts

PhD in Geometric Modeling | Google Developer Expert (Machine Learning) | Top Writer 3x (Medium) | More at https://www.linkedin.com/in/yogeshkulkarni/