Word Vectorization: A Revolutionary Approach In NLP

Anuj Syal
Published in Analytics Vidhya
7 min read · Aug 19, 2020

Photo by Hope House Press — Leather Diary Studio on Unsplash

Language lets humans communicate their ideas to one another. In AI and ML, Natural Language Processing (NLP) plays a similar role: it lets deep learning models work with input that is not numerical, such as text.

In NLP, a methodology called Word Embeddings or Word Vectorization maps words or phrases from a vocabulary to corresponding vectors of real numbers. These vectors can then be used for tasks such as word prediction and measuring word similarity or semantics. This process of converting words into numbers is called Vectorization.
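To make the mapping described above a little more concrete, here is a minimal sketch in Python. The three-dimensional vectors in `toy_vectors` are made up by hand purely for illustration (real embeddings are learned from text and have many more dimensions), and `cosine_similarity` is a helper written for this example, not a function from any particular library.

```python
import math

# Hypothetical 3-dimensional vectors, hand-picked for illustration only.
# Real word embeddings are learned from data and are much larger.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words whose vectors point in similar directions get a higher score.
sim_king_queen = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_king_apple = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

print(f"king vs queen: {sim_king_queen:.3f}")
print(f"king vs apple: {sim_king_apple:.3f}")
```

With these made-up numbers, "king" scores closer to "queen" than to "apple". That is the kind of word similarity a vector representation is meant to capture.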

Word Vector…What?

For those new to the AI mumbo-jumbo, I will try to explain it more simply. Suppose you are talking to someone who does not understand your language. You would use gestures or objects to get your idea across. Word Vectorization works in much the same way. Deep learning models cannot comprehend text or words in their original form, so Word Vectorization turns individual words into vectors that a machine learning algorithm can consume and process.
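One of the simplest ways to turn words into vectors is one-hot encoding, sketched below. This is only one possible scheme, picked here as an assumption for illustration. The helper names `build_vocabulary` and `one_hot` and the sample sentence are hypothetical.

```python
def build_vocabulary(sentence):
    """Map each distinct lowercase word to an index, in order of first appearance."""
    vocab = {}
    for word in sentence.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a vector of zeros with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec


sentence = "the cat sat on the mat"
vocab = build_vocabulary(sentence)

# One vector per word in the sentence; repeated words share the same vector.
vectors = [one_hot(word, vocab) for word in sentence.lower().split()]

for word, vec in zip(sentence.split(), vectors):
    print(word, vec)
```

Each word becomes a list of numbers the model can consume. Note that one-hot vectors carry no notion of meaning or similarity, which is one reason learned embeddings exist.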

What is a Vector?

A vector is a mathematical or geometrical representation of a quantity. Consider the vector for a geometrical point P, [2, 3, 4]. This vector basically represents the point P in…
