Combining supervised learning and unsupervised learning to improve word vectors
Introduction to Generative Pre-Training
To achieve state-of-the-art results in NLP tasks, researchers have tried tremendous ways to make machines understand language and solve downstream tasks such as textual entailment and semantic classification. OpenAI released a new model named Generative Pre-Training (GPT).
After reading this article, you will understand:
- Finetuned Transformer LM Design
- Architecture
- Experiments
- Implementation
- Take Away
Finetuned Transformer LM Design
This approach includes two steps. First, a model is trained via unsupervised learning on a vast amount of data. Second, a target data set (domain data) is used to fine-tune the model from the previous step via supervised learning.
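A minimal sketch of this two-step recipe is shown below. It is not the paper's actual Transformer: `TinyLM`, the GRU body, and all sizes are hypothetical placeholders chosen only to illustrate how the same backbone is first pre-trained with a next-token objective and then reused under a supervised classification head.

```python
# Minimal two-step sketch (toy model, not GPT itself):
# step 1 pre-trains a language model on unlabeled token sequences,
# step 2 reuses its weights and fine-tunes a classifier head on labeled data.
import torch
import torch.nn as nn

VOCAB, EMB, NUM_CLASSES = 1000, 64, 2

class TinyLM(nn.Module):
    """Stand-in for the Transformer body: embeds tokens, predicts the next one."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.body = nn.GRU(EMB, EMB, batch_first=True)  # placeholder for Transformer blocks
        self.lm_head = nn.Linear(EMB, VOCAB)            # next-token prediction head

    def forward(self, tokens):
        hidden, _ = self.body(self.embed(tokens))
        return hidden, self.lm_head(hidden)

# --- Step 1: unsupervised pre-training on unlabeled text -------------------
model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
unlabeled = torch.randint(0, VOCAB, (32, 20))            # fake corpus of token ids
for _ in range(3):
    _, logits = model(unlabeled[:, :-1])
    loss = nn.functional.cross_entropy(                  # predict token t+1 from tokens <= t
        logits.reshape(-1, VOCAB), unlabeled[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# --- Step 2: supervised fine-tuning on the target (domain) dataset ---------
clf_head = nn.Linear(EMB, NUM_CLASSES)                   # new task-specific head
opt = torch.optim.Adam(list(model.parameters()) + list(clf_head.parameters()), lr=1e-4)
texts = torch.randint(0, VOCAB, (16, 20))                # fake labeled examples
labels = torch.randint(0, NUM_CLASSES, (16,))
for _ in range(3):
    hidden, _ = model(texts)
    loss = nn.functional.cross_entropy(clf_head(hidden[:, -1]), labels)
    opt.zero_grad(); loss.backward(); opt.step()
```

The point of the sketch is the reuse: the pre-trained body keeps its weights in step 2, and only a small labeled set is needed to adapt it to the downstream task.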
Unsupervised Learning
There is no denying that there is a practically unlimited amount of unlabeled data for NLP. Radford et al. believe that leveraging such an unlimited corpus helps to train a general-purpose model, just like word2vec (word embeddings) and skip-thought (sentence embeddings). We do not need to consider the…