A new visualization tool shows how BERT forms its distinctive attention patterns.

In Part 1 (not a prerequisite) we explored how the BERT language understanding model learns a variety of interpretable structures. In Part 2, we will drill deeper into BERT’s attention mechanism and reveal the secrets to its shape-shifting superpowers.

🕹 Try out an interactive demo at the BertViz GitHub page.

Giving machines the ability to understand natural language has been an aspiration of Artificial Intelligence since the field’s inception, but this goal has proved elusive. In some sense, understanding language requires solving the larger problem of artificial general intelligence (AGI). …


How the super-sized language model is able to finish your thoughts.

In the eyes of most NLP researchers, 2018 was a year of great technological advancement, with new pre-trained NLP models shattering records on tasks ranging from sentiment analysis to question answering.

But for others, 2018 was the year that NLP ruined Sesame Street forever.

First came ELMo (Embeddings from Language Models) and then BERT (Bidirectional Encoder Representations from Transformers), and now BigBird sits atop the GLUE leaderboard. …


From BERT’s tangled web of attention, some intuitive patterns emerge.

The year 2018 marked a turning point for the field of Natural Language Processing, with a series of deep-learning models achieving state-of-the-art results on NLP tasks ranging from question answering to sentiment classification. Most recently, Google’s BERT algorithm has emerged as a sort of “one model to rule them all,” based on its superior performance over a wide variety of tasks.

BERT builds on two key ideas that have been responsible for many of the recent advances in NLP: (1) the transformer architecture and (2) unsupervised pre-training. The transformer is a sequence model that forgoes the recurrent structure of RNNs…


The year is 1982. Michael Jackson’s Thriller is the best-selling album in the world. Surgeons recently implanted the world’s first artificial heart. Time magazine just selected “The Computer” as Man of the Year. But none of that matters at the moment because you’re 30 minutes into an 8-hour family car trip and you are bored out of your mind.

Do you entertain yourself by pulling up Instagram on your iPhone XS? No, Apple is still preparing for the release of the eagerly awaited Apple IIe. Do you check out the latest Harry Potter book? Not so fast, J. K. Rowling…

Jesse Vig

NLP Researcher • jessevig.com
