Understanding Pointwise Mutual Information in NLP

An implementation with Python

Published in DataSeries

5 min read · Jan 31, 2020

Natural Language Processing (NLP) is a field of Artificial Intelligence whose purpose is to find computational methods for interpreting human language as it is spoken or written. NLP goes beyond a mere classification task that could be carried out by ML algorithms or deep learning neural networks. Indeed, NLP is about interpretation: you want to train your model not only to detect frequent words, count them, or strip out noisy punctuation; you want it to tell you whether the mood of a conversation is positive or negative, whether an e-mail is mere advertising or something important, whether the reviews of thriller books in recent years have been good or bad.

However, treating words, documents, and topics as numerical inputs is more cumbersome than working with numbers or images. NLP also relies on deep neural network techniques to solve a given task, and these algorithms accept only numeric variables as inputs. So, how can we ‘quantify’ a word?

The idea behind NLP algorithms is to map words into a vector space, where each word is a D-dimensional vector of features. By doing so, we can compute quantitative metrics on words and between words, such as their cosine similarity. Needless to…
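To make the vector-space idea concrete, here is a minimal sketch of cosine similarity between word vectors, using only the Python standard library. The three-dimensional vectors and the words chosen are made-up values for illustration, not the output of any trained model, and the function name `cosine_similarity` is an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional feature vectors (illustrative values only).
word_vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

sim_royal = cosine_similarity(word_vectors["king"], word_vectors["queen"])
sim_fruit = cosine_similarity(word_vectors["king"], word_vectors["banana"])

print(f"king vs queen:  {sim_royal:.3f}")
print(f"king vs banana: {sim_fruit:.3f}")
```

With these toy values, the first pair scores higher than the second because the two vectors point in nearly the same direction. In practice D is much larger and the vectors are learned from a corpus rather than written by hand.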

Written by Valentina Alto