Encoding Categorical Variables

Label Encoding and One-Hot Encoding

Swati Sinha
WiCDS


Photo by Bailey Granneman on Unsplash

Exploratory data analysis (EDA) is an important step in any data science project. A common question during EDA is how to handle categorical variables. Most ML algorithms expect numerical input, so object or categorical variables must be converted into a numeric form that the algorithm can understand.

We will discuss two of the most widely used techniques for converting categorical variables:

1. Label Encoding

2. One-hot Encoding
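Before turning to the dataset, here is a minimal pure-Python sketch of what the two techniques produce. The helper names `label_encode` and `one_hot_encode` and the sample package values are hypothetical, chosen only for illustration; in practice a library such as pandas or scikit-learn would usually do this work.

```python
def label_encode(values):
    """Replace each category with an integer code (codes follow sorted order)."""
    categories = sorted(set(values))
    code_of = {category: code for code, category in enumerate(categories)}
    return [code_of[v] for v in values], categories


def one_hot_encode(values):
    """Replace each category with a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories


# Hypothetical sample values, not taken from the article's dataset.
packages = ["Gold", "Silver", "Gold", "Platinum"]

codes, label_categories = label_encode(packages)
vectors, one_hot_categories = one_hot_encode(packages)

print(label_categories)  # ['Gold', 'Platinum', 'Silver']
print(codes)             # [0, 2, 0, 1]
print(vectors)           # [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

Label encoding yields a single column of integers, which can suggest an ordering between categories that may not exist. One-hot encoding avoids that implied order at the cost of one extra column per category.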

Let’s explore both techniques using a passenger dataset.

This dataset has six columns: Name, Gender, Age, Package, TicketCost and Destination.

The attributes Name, Gender, Package and Destination have the object data type, i.e. they are categorical.
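As a rough sketch of how the categorical columns might be identified with pandas, the snippet below builds a small stand-in DataFrame with the same column names. The row values are invented for illustration and are not the article's data; only the column names come from the description above.

```python
import pandas as pd

# Stand-in data: column names follow the article, row values are hypothetical.
passengers = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "Gender": ["F", "M", "M"],
    "Age": [34, 27, 45],
    "Package": ["Gold", "Silver", "Gold"],
    "TicketCost": [1200.0, 800.0, 1250.0],
    "Destination": ["Paris", "Tokyo", "Paris"],
})

# Non-numeric columns are the ones that need encoding before modelling.
categorical_cols = passengers.select_dtypes(exclude="number").columns.tolist()
numeric_cols = passengers.select_dtypes(include="number").columns.tolist()

print(categorical_cols)
print(numeric_cols)
```

Selecting by "not numeric" rather than by the exact `object` dtype is a deliberate choice here, since the dtype pandas assigns to text columns can differ between versions.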

Label Encoding
