Neural Networks for Word Embeddings: Introduction to Natural Language Processing Part 3

Mandy Gu
Published in Analytics Vidhya
7 min read · Oct 14, 2018


This is the third part of my series on Introduction to NLP.

In Part 1, we talked about the Bag of Words model, a naive representation of language which creates vectors based on the number of times a term appears in a document. In Part 2, we classified text messages using the Tf-idf vectorizer, which takes the Bag of Words model and re-weights each term by its relevance to the document.
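
As a quick refresher, here is a minimal sketch of both representations using scikit-learn (a toy three-sentence corpus is assumed, and `get_feature_names_out` requires scikit-learn 1.0 or later):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus purely for illustration
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Part 1: Bag of Words -- raw term counts per document
bow = CountVectorizer()
bow_matrix = bow.fit_transform(corpus)
print(bow.get_feature_names_out())
print(bow_matrix.toarray())

# Part 2: Tf-idf -- the same counts, re-weighted by how informative each term is
tfidf = TfidfVectorizer()
tfidf_matrix = tfidf.fit_transform(corpus)
print(tfidf_matrix.toarray().round(2))
```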

Now, we will explore the much more complex set of embeddings created using shallow neural networks, focusing on word2vec models. Trained over large corpora, word2vec uses unsupervised learning to capture semantic and syntactic relationships from word co-occurrence, and uses them to construct a vector representation for every word in the vocabulary.
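
To make this concrete, here is a minimal sketch of training such a model with the gensim library (one common choice, not necessarily the one used later in this article; the tiny tokenized sentences and all hyperparameter values are assumptions for illustration, and the `vector_size` keyword follows gensim 4.x):

```python
from gensim.models import Word2Vec

# Tokenized toy sentences; a real corpus would be far larger
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
    ["cats", "and", "dogs", "are", "pets"],
]

# vector_size: embedding dimensions, window: co-occurrence context size,
# min_count: ignore words rarer than this, sg=1 selects the skip-gram variant
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Every word in the vocabulary now maps to a dense vector
print(model.wv["cat"].shape)         # (50,)
print(model.wv.most_similar("cat"))  # nearest neighbours in embedding space
```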

This article will provide a high-level overview of how word2vec works and walk through an example of training your own model.

Word2vec was developed at Google by a research team led by Tomas Mikolov. The research paper takes a while to read, but is worth the time and effort.

The model uses a two-layer shallow neural network to find the vector mapping for each word in the corpus. The neural network is trained to predict known co-occurrences in the corpus…
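
Those "known co-occurrences" are simply pairs of words that appear near each other. A minimal sketch of how skip-gram style training pairs could be generated with a sliding window (the helper name and window size are hypothetical, not the paper's exact sampling procedure):

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) pairs from a token list.

    The network is trained to predict the context word given the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Words within `window` positions of the center are its context
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs(["the", "cat", "sat", "on", "the", "mat"]))
# [('the', 'cat'), ('the', 'sat'), ('cat', 'the'), ('cat', 'sat'), ...]
```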
