NLP: Text Data To Numbers
Explaining How We Can Convert Text To Numbers For Data Science Projects
Working on an NLP project can be tedious, particularly when the data is textual and the models require numerical values. This article explains how we can convert text to numerical values.
Handling Categorical Values
Let’s assume we want to forecast a variable, e.g. Number Of Tweets, and that it depends on the following two variables: Most Active Current News Type and Number Of Active Users.
In this instance, Most Active Current News Type is a categorical feature. It can contain textual data such as “Fashion”, “Economical”, etc. Number Of Active Users, in contrast, contains numerical values.
Scenario
Before we feed the data set into our model, we need to transform categorical values into numerical values because many models do not work with textual values.
Solution: Dictionary
There are a number of strategies to handle categorical features:
- Create a dictionary to map categorical values to numerical values
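As a minimal sketch of the dictionary strategy, the snippet below maps the news-type categories from the example to integer codes. The specific codes, the toy rows, and the names `news_type_to_code`, `rows`, and `encoded_rows` are assumptions made for illustration only; they are not taken from a real data set.

```python
# Hypothetical mapping from category to code; the codes are arbitrary.
news_type_to_code = {"Fashion": 0, "Economical": 1}

# Toy rows: (most active current news type, number of active users).
rows = [("Fashion", 1200), ("Economical", 950), ("Fashion", 400)]

# Replace each category with its code; the numerical column is left unchanged.
encoded_rows = [
    (news_type_to_code[news_type], users) for news_type, users in rows
]

print(encoded_rows)  # [(0, 1200), (1, 950), (0, 400)]
```

Note that a category missing from the dictionary would raise a `KeyError` in this sketch, so in practice the mapping needs to cover every value that can appear in the feature.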
A dictionary is a data structure that stores a collection of key-value pairs. It enables a key…