Experimenting with OpenAI’s Improved Language Model

This post introduces a Colab notebook that shows how to use OpenAI’s GPT-2 language model.

elvis
DAIR.AI
2 min read · Feb 17, 2019

In this Colab notebook (link below), you can play around with the small version (117M parameters) of OpenAI’s GPT-2 model from the paper Language Models are Unsupervised Multitask Learners.
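To give a feel for what playing with the model looks like, here is a minimal sketch of sampling a continuation from the small GPT-2 model. This is not the notebook’s own code (the notebook builds on OpenAI’s original release); it assumes the Hugging Face transformers library and its “gpt2” checkpoint, which corresponds to the ~117M-parameter model, and the prompt and sampling settings are purely illustrative.

```python
# Minimal sketch: sampling a continuation from the small GPT-2 model.
# Assumes the Hugging Face `transformers` library and its "gpt2"
# checkpoint (the ~117M-parameter model); prompt and sampling settings
# are illustrative, not taken from the notebook.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The researchers were surprised to find"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Top-k sampling keeps the generated text varied without drifting too far.
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=60,
        do_sample=True,
        top_k=40,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

A GPU-backed Colab runtime makes sampling noticeably faster, although the 117M model is small enough to run on CPU as well.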

According to the authors, GPT-2 was trained on the task of language modeling, i.e., predicting the next word given the preceding text, by ingesting huge numbers of articles, blogs, and websites. Using only this data, it achieved state-of-the-art scores on a number of language tasks it was never explicitly trained for, a setting known as zero-shot learning. It can also perform other writing-related tasks, such as translating text from one language to another, summarizing long articles, and answering trivia questions.
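To make the language-modeling objective concrete: given a prefix, the model assigns a probability to every possible next token. The sketch below, again assuming the Hugging Face transformers API rather than the notebook’s original code, prints the five most likely next tokens for a hypothetical prefix.

```python
# Minimal sketch of the language-modeling objective: given a prefix,
# the model produces a probability distribution over the next token.
# Assumes the Hugging Face `transformers` API; the prefix is illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prefix = "The capital of France is"
input_ids = tokenizer.encode(prefix, return_tensors="pt")

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, sequence_length, vocab_size)

# Probabilities for the token that would follow the prefix.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([int(token_id)])!r}  {prob.item():.3f}")
```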

OpenAI decided not to release the dataset, training code, or the full GPT-2 model weights, due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale. Some examples of malicious applications of these models are:

  • Generate misleading news articles
  • Impersonate others online
  • Automate the production of abusive or faked content to post on social media
  • Automate the production of spam/phishing content

As one can imagine, these capabilities, combined with recent advances in the generation of synthetic imagery, audio, and video, mean that it has never been easier to create fake content and spread disinformation at scale. The public at large will need to become more skeptical of the content they consume online.

Snapshot of the notebook:

Colab notebook (Link) | GitHub (Link)

Full credit goes to Ignacio López-Francos (@iglpzfr)
