TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Member-only story

Contour Plots and Word Embedding Visualisation in Python

Contour plots are simple and very useful graphics for word embedding visualization. This end-to-end tutorial uses IMDb data to illustrate coding in Python.

Petr Korab
TDS Archive
Published in
5 min readNov 15, 2022

--

Figure 1. Contour plot example. Source: commons.wikimedia.org.

Introduction

Text data vectorization is a necessary step in modern Natural Language Processing (NLP). The underlying concept of word embeddings has been popularised by two Word2Vec models developed by Mikolov et al. (2013a) and Mikolov et al. (2013b). Vectorizing text data in high-dimensional space, we can tune ML models for tasks like machine translation and sentiment classification.

Word embeddings are typically plotted as 3-D graphs, which brings some difficulties in presenting them easily in papers and presentations. Here, the contour plot is a simplification of how to tackle 3-dimensional data visualization, not only in NLP but originally in many other disciplines. Contour plots can be used to present word embeddings (i.e., vector representations of words) in a 2-D dimensional graph. This article provides a step-by-step tutorial for the visualization of word embedding in 2-D space and develops the use cases where contour plots can be used.

What are contour plots?

--

--

TDS Archive
TDS Archive

Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Petr Korab
Petr Korab

Written by Petr Korab

Python engineer /NLP / data Viz. Text Mining Stories founder textminingstories.com

Responses (2)