NLP: Text Data To Numbers

Explaining How We Can Convert Text To Numbers For Data Science Projects

Farhad Malik
FinTechExplained


Working on an NLP project can be tedious, particularly when the data is textual and the models require numerical values. This article explains how we can convert text to numerical values.


Handling Categorical Values

Let’s assume we want to forecast a variable, e.g. Number Of Tweets, and that it depends on the following two variables: Most Active Current News Type and Number Of Active Users.

In this instance, Most Active Current News Type is a categorical feature. It can contain textual data such as “Fashion”, “Economical” etc. Number Of Active Users, by contrast, contains numerical values.

Scenario

Before we feed the data set into our model, we need to transform categorical values into numerical values because many models do not work with textual values.

Solution: Dictionary

There are a number of strategies to handle categorical features:

  1. Create a dictionary to map categorical values to numerical values

A dictionary is a data structure that stores a collection of key-value pairs. It enables a key…
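As a minimal sketch of the dictionary strategy, the Python snippet below maps each news type to a number and applies that mapping to a few rows. The sample rows, the numeric codes and the names `news_type_to_code` and `encode_rows` are assumptions made for illustration; they are not taken from the original data set.

```python
# Hypothetical sample rows: (Most Active Current News Type, Number Of Active Users)
rows = [
    ("Fashion", 1200),
    ("Economical", 950),
    ("Fashion", 1430),
]

# Dictionary mapping each categorical value (key) to a numerical code (value).
# The codes are arbitrary labels chosen for this example.
news_type_to_code = {"Fashion": 0, "Economical": 1}


def encode_rows(rows, mapping):
    """Replace the categorical first column of each row with its numerical code."""
    return [(mapping[news_type], active_users) for news_type, active_users in rows]


encoded_rows = encode_rows(rows, news_type_to_code)
print(encoded_rows)  # [(0, 1200), (1, 950), (0, 1430)]
```

Note that a lookup with `mapping[news_type]` raises a `KeyError` for a category missing from the dictionary, so every expected value needs an entry. The integer codes also imply an order (0 < 1) that the categories themselves may not have, which can matter for some models.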

