NLP: Text Data To Numbers

Explaining How We Can Convert Text To Numbers For Data Science Projects

Published in FinTechExplained

4 min read · Jun 25, 2019

Working on an NLP project can be a tedious task, particularly when the data is in textual format and the models require numerical values. This article explains how we can convert text to numerical values.

Handling Categorical Values

Let’s assume we want to forecast a variable, e.g. Number Of Tweets, that depends on the following two variables: Most Active Current News Type and Number Of Active Users.

In this instance, Most Active Current News Type is a categorical feature. It can contain textual data such as “Fashion”, “Economical”, etc. Number Of Active Users, by contrast, contains numerical values.

Scenario

Before we feed the data set into our model, we need to transform categorical values into numerical values because many models do not work with textual values.

Solution: Dictionary

There are a number of strategies to handle categorical features:

Create a dictionary to map categorical values to numerical values

A dictionary is a data storage structure. It holds a collection of key-value pairs. It enables a key…
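
As a minimal sketch of the dictionary strategy, the snippet below maps each distinct news type to an integer and then swaps the text for that integer. The sample rows and the names rows, category_to_code and encoded_rows are assumptions made up for illustration; they do not come from a real data set.

```python
# Hypothetical sample rows (illustrative values only):
# (Most Active Current News Type, Number Of Active Users)
rows = [
    ("Fashion", 1200),
    ("Economical", 800),
    ("Fashion", 1500),
]

# Build the dictionary: each distinct category gets the next unused integer.
category_to_code = {}
for news_type, _ in rows:
    if news_type not in category_to_code:
        category_to_code[news_type] = len(category_to_code)

# Replace each textual category with its numeric code.
encoded_rows = [(category_to_code[news_type], users) for news_type, users in rows]

print(category_to_code)  # {'Fashion': 0, 'Economical': 1}
print(encoded_rows)      # [(0, 1200), (1, 800), (0, 1500)]
```

The codes here are arbitrary labels assigned in order of first appearance, so the fact that 1 is larger than 0 carries no meaning about the categories themselves.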


Written by Farhad Malik