Word2Vec

Long
3 min read · Jul 4, 2017


  1. Intro
  2. How to establish a word2vec model
  3. Example

1. Intro

Word2Vec is a method that represents a word as a vector, because it is difficult for a neural network to work with raw words directly.

How can we represent a word with a vector of numbers?

  • One-Hot Encoding: if the vocabulary contains 1,000 distinct words and “cat” sits at index 50, then vector[50] = 1 and every other entry is 0.
[Figure: one-hot encoding]

One-hot encoding is the most basic way to digitize words: it identifies a word but carries no meaning, so you can learn nothing about the word from the vector itself.
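To make this concrete, here is a minimal sketch of one-hot encoding in Python (the toy vocabulary and indices are illustrative, not from the original post):

```python
import numpy as np

# Toy vocabulary; in the article's example the vocabulary has 1,000 words
# and 'cat' sits at index 50. Everything here is illustrative.
vocab = ["the", "quick", "brown", "fox", "cat"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Return a vector of zeros with a single 1 at the word's index."""
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec

print(one_hot("cat"))  # [0. 0. 0. 0. 1.]
```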

  • Skip-gram model: this is the architecture that can abstract a word into a vector of numbers (features).
[Figure: word embedding]

The word embedding matrix has shape (vocab_num × feature_num).
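A minimal sketch of what that matrix looks like and how a word’s vector is looked up (the sizes below are illustrative assumptions):

```python
import numpy as np

vocab_num, feature_num = 1000, 300  # illustrative sizes

# The embedding matrix: one row of `feature_num` features per word.
embedding = np.random.randn(vocab_num, feature_num)

# Looking up a word's vector is just selecting its row; multiplying a
# one-hot vector by this matrix yields the same result.
cat_index = 50
cat_vector = embedding[cat_index]
print(cat_vector.shape)  # (300,)
```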

2. How to establish a word2vec model?

The goal of Word2Vec is to train the embedding model itself, not to predict some result.

Step 1: get inputs and targets

The example below shows some of the training samples, word pairs of the form (input, target), that we would take from the sentence “The quick brown fox jumps over the lazy dog.” I’ve used a small window size of 2 just for the example. The word highlighted in blue is the input word.

[Figure: inputs and targets]
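A small sketch of how those (input, target) pairs could be generated in Python (the variable names are my own, not from the post’s gist):

```python
sentence = "the quick brown fox jumps over the lazy dog".split()
window_size = 2

# Pair each word with every word within `window_size` positions of it.
pairs = []
for i, input_word in enumerate(sentence):
    lo = max(0, i - window_size)
    hi = min(len(sentence), i + window_size + 1)
    for j in range(lo, hi):
        if j != i:
            pairs.append((input_word, sentence[j]))

print(pairs[:4])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```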

Every word is represented by the word pairs it appears in: when a pair occurs frequently, the model assigns a high probability to that nearby word, and vice versa.

Building the embedding model is the process of using these occurrence statistics to adjust the embedding matrix through backpropagation.

So, if two words have similar contexts, then our network is motivated to learn similar word vectors for these two words!

[Figure: similar word vectors]
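One common way to check this property is cosine similarity between the learned vectors; the sketch below uses made-up 3-dimensional vectors purely for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means similar direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical learned vectors: words with similar contexts end up close together.
v_cat = np.array([0.9, 0.1, 0.4])
v_kitten = np.array([0.85, 0.15, 0.5])
v_car = np.array([-0.2, 0.8, -0.6])

print(cosine_similarity(v_cat, v_kitten))  # high (close to 1)
print(cosine_similarity(v_cat, v_car))     # low (negative here)
```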

There are some tricks used in the real training process:

  1. Eliminate meaningless (or very high-frequency) words such as “a”, “the”, “of”, and “then” from the dataset. This improves both accuracy and training time.
  2. Negative sampling: for every input, we update the weights for the correct label, but only for a small number of incorrect labels (see the sketch after this list).
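As a rough sketch of trick 2, here is what a single negative-sampling update could look like in NumPy (the matrix names, sizes, and learning rate are assumptions, not the post’s actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_num, feature_num, num_negative = 1000, 100, 5

# Two weight matrices: input (embedding) vectors and output (context) vectors.
W_in = rng.normal(scale=0.1, size=(vocab_num, feature_num))
W_out = rng.normal(scale=0.1, size=(vocab_num, feature_num))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_pair(input_idx, target_idx, lr=0.025):
    """One skip-gram step with negative sampling: push the true target's
    score toward 1 and a few random 'negative' words' scores toward 0."""
    negatives = rng.integers(0, vocab_num, size=num_negative)
    v_in = W_in[input_idx]
    grad_in = np.zeros(feature_num)
    for idx, label in [(target_idx, 1.0)] + [(int(n), 0.0) for n in negatives]:
        score = sigmoid(v_in @ W_out[idx])
        grad = score - label              # derivative of the logistic loss
        grad_in += grad * W_out[idx]
        W_out[idx] -= lr * grad * v_in    # only this output row is updated
    W_in[input_idx] -= lr * grad_in       # update the input word's vector

train_pair(input_idx=50, target_idx=7)
```

The key point is in the last two comments: instead of touching all vocab_num output rows (as a full softmax would), each step updates only the target row plus a handful of sampled negative rows.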

3. Example:

[Embedded gist: word2vec core code]
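The core-code gist is embedded in the original post and not reproduced here; as a stand-in, the gensim library (an assumption on my part, not necessarily what the author used) can train an equivalent skip-gram model with negative sampling in a few lines:

```python
from gensim.models import Word2Vec  # assumes gensim >= 4.0

sentences = [
    "the quick brown fox jumps over the lazy dog".split(),
]

# sg=1 selects the skip-gram architecture; negative=5 enables negative sampling.
model = Word2Vec(
    sentences,
    vector_size=100,  # feature_num
    window=2,         # same window size as the example above
    min_count=1,
    sg=1,
    negative=5,
)

print(model.wv["fox"].shape)         # (100,)
print(model.wv.most_similar("fox"))  # nearest words by cosine similarity
```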

[Link: full_code]
