Combining supervised learning and unsupervised learning to improve word vectors

Introduction to Generative Pre-Training

Edward Ma
Towards Data Science
5 min read · Jan 20, 2019

To achieve state-of-the-art results on NLP tasks, researchers have tried countless ways to make machines understand language and solve downstream tasks such as textual entailment and semantic classification. OpenAI released a new model named Generative Pre-Training (GPT).

After reading this article, you will understand:

  • Finetuned Transformer LM Design
  • Architecture
  • Experiments
  • Implementation
  • Take Away

Finetuned Transformer LM Design

This approach involves two steps. First, a model is trained via unsupervised learning on a vast amount of unlabeled data. Second, a target dataset (domain data) is used to fine-tune the model from the previous step via supervised learning.
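
To make the two steps concrete, here is a minimal sketch (not the original OpenAI code): a tiny PyTorch Transformer with a causal mask stands in for GPT's decoder, and one shared backbone feeds a language-modeling head for step 1 and a classification head for step 2. The model sizes, tokens and labels below are placeholders.

```python
import torch
import torch.nn as nn

class TinyTransformerLM(nn.Module):
    """Shared backbone: token embeddings + a small Transformer with a causal mask."""
    def __init__(self, vocab_size=1000, d_model=64, n_layers=2, n_heads=4, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)   # step 1: next-token prediction
        self.clf_head = nn.Linear(d_model, n_classes)   # step 2: task-specific head

    def forward(self, tokens, task="lm"):
        seq_len = tokens.size(1)
        # Causal mask: each position may only attend to earlier positions.
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        h = self.encoder(self.embed(tokens), mask=causal)
        if task == "lm":
            return self.lm_head(h)          # logits for the next token at every position
        return self.clf_head(h[:, -1])      # classify from the final position

model = TinyTransformerLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Step 1: unsupervised pre-training on unlabeled token sequences (random placeholders here).
unlabeled = torch.randint(0, 1000, (8, 16))
lm_logits = model(unlabeled[:, :-1], task="lm")
lm_loss = ce(lm_logits.reshape(-1, 1000), unlabeled[:, 1:].reshape(-1))
lm_loss.backward(); opt.step(); opt.zero_grad()

# Step 2: supervised fine-tuning on a small labeled (domain) dataset (random placeholders here).
task_x = torch.randint(0, 1000, (8, 16))
task_y = torch.randint(0, 2, (8,))
clf_loss = ce(model(task_x, task="clf"), task_y)
clf_loss.backward(); opt.step(); opt.zero_grad()
```

The point is that both phases update the same backbone weights; only the head and the training objective change between pre-training and fine-tuning.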

Unsupervised Learning

There is no denying that there is a virtually unlimited amount of unlabeled data for NLP. Radford et al. believe that leveraging such an unlimited corpus helps train a general-purpose model, just like word2vec (word embeddings) and skip-thought (sentence embeddings). We do not need to consider the…
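
For reference, the unsupervised step in Radford et al.'s paper maximizes a standard left-to-right language-modeling likelihood over the unlabeled corpus U = (u_1, …, u_n), where k is the size of the context window and Θ are the Transformer parameters:

```latex
L_1(U) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)
```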
