NLP: What are Word Vectors?

Have you ever thought of adding two words to get another one? Oddly enough, it was second nature to me during my school days, as a way to get some relief from the boring lectures 😛 .

In those days, what we really did was this: we assigned a number to each letter and then performed operations on the corresponding letters. Even though meaningful results were few and far between, we were happy that we got at least something.

So far so good, right? Now, what's a word vector? It's just a vector representation of a word such that it is meaningful in some sense.

As simple as that: for every word in the corpus, we have a vector. In the example above, each word is represented by how it relates to the other words in the corpus.

The following can be inferred from the above word vector:

  • The words I and like are likely to co-occur (the value 2 in the cell pairing I and like).
  • Deep and NLP are somehow related (the second entry of both vectors is 1; related in the sense that each is something that someone liked).
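Since the original figure isn't reproduced here, here is a minimal sketch of how such co-occurrence-based word vectors can be built. The toy corpus is an assumption for illustration (the classic one often used with this kind of example), not necessarily the one from the figure:

```python
from collections import defaultdict

# Toy corpus assumed for illustration -- not necessarily the original figure's.
corpus = ["I like deep learning", "I like NLP", "I enjoy flying"]

# Build a window-1 co-occurrence table: counts[w][c] = how often word c
# appears immediately next to word w anywhere in the corpus.
counts = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    tokens = sentence.split()
    for i, w in enumerate(tokens):
        for j in (i - 1, i + 1):
            if 0 <= j < len(tokens):
                counts[w][tokens[j]] += 1

vocab = sorted({w for s in corpus for w in s.split()})
# The word vector for w is simply its row of co-occurrence counts.
vector = {w: [counts[w][c] for c in vocab] for w in vocab}

print(vector["I"])  # the entry for "like" is 2: I and like co-occur twice
```

With this corpus, the I/like cell is 2 because "I like" appears in two sentences, mirroring the first inference above.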

Such inferences can be obtained from word vectors. The quality of these inferences usually depends on the input corpus and the algorithm used to train the vectors. We will cover those algorithms in detail later.

We will wind up this session by listing some use-cases of word vectors.

Interesting use cases of Word Vectors!

Word Arithmetic: We will be able to perform arithmetic operations on words.

For example,

  • Cat + Young = Kitten
  • Puppy + Old = Dog
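In practice, "Cat + Young = Kitten" means adding the two vectors and finding whichever vocabulary word lies closest to the result. A small sketch with invented toy vectors (the dimensions and values are made up purely for illustration, not trained embeddings):

```python
import math

# Invented 3-d vectors (dims roughly: [feline-ness, canine-ness, youth]).
# Purely illustrative -- real embeddings have hundreds of learned dimensions.
vecs = {
    "cat":    [1.0, 0.0, 0.0],
    "kitten": [1.0, 0.0, 1.0],
    "dog":    [0.0, 1.0, 0.0],
    "puppy":  [0.0, 1.0, 1.0],
    "young":  [0.0, 0.0, 1.0],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def nearest(query, exclude):
    # Closest vocabulary word to the query vector, skipping the inputs.
    return max((w for w in vecs if w not in exclude),
               key=lambda w: cosine(vecs[w], query))

target = [a + b for a, b in zip(vecs["cat"], vecs["young"])]
print(nearest(target, exclude={"cat", "young"}))  # -> kitten
```

With trained vectors, libraries expose the same idea as a nearest-neighbour search over the sum (or difference) of word vectors.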

Word Analogy: Given the relationship between one pair of words, we will be able to predict the missing word in another pair.

For example,

  • Man : King :: Woman : Queen
  • Kerala : Trivandrum :: Tamilnadu : Chennai
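The usual trick for analogies is the vector offset: king − man + woman should land near queen. Again with invented toy vectors (dimensions and values are assumptions for illustration only):

```python
import math

# Invented 2-d vectors (dims roughly: [gender, royalty]) -- illustrative only.
vecs = {
    "man":   [1.0, 0.0],
    "woman": [-1.0, 0.0],
    "king":  [1.0, 1.0],
    "queen": [-1.0, 1.0],
    "apple": [0.0, -1.0],  # an unrelated distractor word
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# man : king :: woman : ?  ->  compute king - man + woman, then take the
# nearest word that wasn't part of the question.
query = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]
answer = max((t for t in vecs if t not in {"man", "king", "woman"}),
             key=lambda t: cosine(vecs[t], query))
print(answer)  # -> queen
```

Here the offset king − man captures "royalty", and adding it to woman points at queen.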

Related words: Word embeddings will be able to predict the nearest / related words.

For example, the word electricity can be related to megawatts, kwh, electric, electrical_grid.


Further details will be covered in the upcoming posts.