Word2Vec

Long
3 min read · Jul 4, 2017


  1. Intro
  2. How to establish a word2vec model
  3. Example

1. Intro

Word2Vec is a method that represents a word as a vector, because it is difficult for a neural network to work with raw words directly.

How can we represent a word with a vector of numbers?

  • One-Hot Encoding: if the vocabulary contains 1,000 distinct words and “cat” sits at index 50, then vector[50] = 1 and every other entry is 0.
[Figure: one-hot encoding]

One-hot encoding is the most basic way to digitize words: it identifies a word but carries no meaning, so you can learn nothing about the word from the vector itself.
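To make this concrete, here is a minimal sketch of one-hot encoding in Python (the toy vocabulary and indices are illustrative, not from the original post):

```python
import numpy as np

# Toy vocabulary; in the article's example the vocabulary has 1,000 words
# and 'cat' sits at index 50. Everything here is illustrative.
vocab = ["the", "quick", "brown", "fox", "cat"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Return a vector of zeros with a single 1 at the word's index."""
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec

print(one_hot("cat"))  # [0. 0. 0. 0. 1.]
```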

  • Skip-gram model: this is the architecture that can abstract a word into a vector of numbers (features).
[Figure: word embedding]

The word embedding matrix has shape (vocab_num × feature_num).
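A minimal sketch of what that matrix looks like and how a word’s vector is looked up (the sizes below are illustrative assumptions):

```python
import numpy as np

vocab_num, feature_num = 1000, 300  # illustrative sizes

# The embedding matrix: one row of `feature_num` features per word.
embedding = np.random.randn(vocab_num, feature_num)

# Looking up a word's vector is just selecting its row; multiplying a
# one-hot vector by this matrix yields the same result.
cat_index = 50
cat_vector = embedding[cat_index]
print(cat_vector.shape)  # (300,)
```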

2. How to establish a word2vec model?

The goal of Word2Vec is to train the embedding model itself, not to predict some result.

Step 1: get inputs and targets

The example below shows some of the training samples, word pairs of the form (input, target), that we would take from the sentence “The quick brown fox jumps over the lazy dog.” I’ve used a small window size of 2 just for the example. The word highlighted in blue is the input word.

[Figure: inputs and targets]
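A small sketch of how those (input, target) pairs could be generated in Python (the variable names are my own, not from the post’s gist):

```python
sentence = "the quick brown fox jumps over the lazy dog".split()
window_size = 2

# Pair each word with every word within `window_size` positions of it.
pairs = []
for i, input_word in enumerate(sentence):
    lo = max(0, i - window_size)
    hi = min(len(sentence), i + window_size + 1)
    for j in range(lo, hi):
        if j != i:
            pairs.append((input_word, sentence[j]))

print(pairs[:4])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```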

Every word is represented by the word pairs it appears in: when a pair occurs frequently, the model assigns a high probability to that nearby word, and vice versa.

Building the embedding model is the process of using these occurrence statistics to adjust the embedding matrix through backpropagation.

So, if two words have similar contexts, then our network is motivated to learn similar word vectors for these two words!

[Figure: similar word vectors]
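One common way to check this property is cosine similarity between the learned vectors; the sketch below uses made-up 3-dimensional vectors purely for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means similar direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical learned vectors: words with similar contexts end up close together.
v_cat = np.array([0.9, 0.1, 0.4])
v_kitten = np.array([0.85, 0.15, 0.5])
v_car = np.array([-0.2, 0.8, -0.6])

print(cosine_similarity(v_cat, v_kitten))  # high (close to 1)
print(cosine_similarity(v_cat, v_car))     # low (negative here)
```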

There are some tricks used in the real training process:

  1. Eliminate meaningless (or very high-frequency) words such as “a”, “the”, “of”, and “then” from the dataset. This improves both accuracy and training time.
  2. Negative sampling: for every input, we update the weights for the correct label, but only for a small number of incorrect labels (see the sketch after this list).
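As a rough sketch of trick 2, here is what a single negative-sampling update could look like in NumPy (the matrix names, sizes, and learning rate are assumptions, not the post’s actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_num, feature_num, num_negative = 1000, 100, 5

# Two weight matrices: input (embedding) vectors and output (context) vectors.
W_in = rng.normal(scale=0.1, size=(vocab_num, feature_num))
W_out = rng.normal(scale=0.1, size=(vocab_num, feature_num))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_pair(input_idx, target_idx, lr=0.025):
    """One skip-gram step with negative sampling: push the true target's
    score toward 1 and a few random 'negative' words' scores toward 0."""
    negatives = rng.integers(0, vocab_num, size=num_negative)
    v_in = W_in[input_idx]
    grad_in = np.zeros(feature_num)
    for idx, label in [(target_idx, 1.0)] + [(int(n), 0.0) for n in negatives]:
        score = sigmoid(v_in @ W_out[idx])
        grad = score - label              # derivative of the logistic loss
        grad_in += grad * W_out[idx]
        W_out[idx] -= lr * grad * v_in    # only this output row is updated
    W_in[input_idx] -= lr * grad_in       # update the input word's vector

train_pair(input_idx=50, target_idx=7)
```

The key point is in the last two comments: instead of touching all vocab_num output rows (as a full softmax would), each step updates only the target row plus a handful of sampled negative rows.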

3. Example:

[Embedded gist: word2vec core code]
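The core-code gist is embedded in the original post and not reproduced here; as a stand-in, the gensim library (an assumption on my part, not necessarily what the author used) can train an equivalent skip-gram model with negative sampling in a few lines:

```python
from gensim.models import Word2Vec  # assumes gensim >= 4.0

sentences = [
    "the quick brown fox jumps over the lazy dog".split(),
]

# sg=1 selects the skip-gram architecture; negative=5 enables negative sampling.
model = Word2Vec(
    sentences,
    vector_size=100,  # feature_num
    window=2,         # same window size as the example above
    min_count=1,
    sg=1,
    negative=5,
)

print(model.wv["fox"].shape)         # (100,)
print(model.wv.most_similar("fox"))  # nearest words by cosine similarity
```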

[Link: full_code]
