Ordinal Encoding — A Brief Explanation

Wojtek Fulmyk, Data Scientist
3 min read · Jul 25, 2023


Article level: Beginner

My clients often ask me about the specifics of certain data preprocessing methods, why they're needed, and when to use them. I will discuss a few common (and not-so-common) preprocessing methods in a series of articles on the topic.

In this preprocessing series:

Data Standardization — A Brief Explanation — Beginner
Data Normalization — A Brief Explanation — Beginner
One-hot Encoding — A Brief Explanation — Beginner
Ordinal Encoding — A Brief Explanation — Beginner
Missing Values in Dataset Preprocessing — Intermediate
Text Tokenization and Vectorization in NLP — Intermediate
Outlier Detection in Dataset Preprocessing — Intermediate
Feature Selection in Data Preprocessing — Advanced

In this short writeup I will explain what ordinal encoding is generally about. The article is not overly technical, but some understanding of specific terms will be helpful, so I have included short explanations of the more complicated terminology below. Give it a go, and if you need more info, just ask in the comments section!

preprocessing technique: Transforming raw data before modeling to improve performance.

ordinal: Having a relative ordering or sequence.

numerical input features: Inputs represented as numbers.

one-hot encoding: Creating binary columns for categories.

binary columns: Features with 0/1 values.

sequential ordering: A meaningful order.

nominal categories: Labels without order.

Ordinal Encoding

The Why

Ordinal encoding is a preprocessing technique used for converting categorical data into numeric values that preserve their inherent ordering. It is useful when working with machine learning models like neural networks that expect numerical input features. Ordinal encoding provides two key benefits:

A) Encoding categorical data into numeric forms that algorithms can understand.

B) Retaining the ordinal information between categories that is lost with one-hot encoding.

The How

Ordinal encoding works by mapping each unique category value to a different integer. Typically, integers start at 0 and increase by 1 for each additional category.

For example, a “size” variable with values [“small”, “medium”, “large”] would be mapped to [0, 1, 2]. The ordinal relationships are maintained — “small” < “medium” < “large”.
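As a rough sketch in plain Python, this mapping can be written as an explicit dictionary (the size values are from the example above; the observations list is made up for illustration):

```python
# Explicit mapping that preserves the intended order: small < medium < large
size_order = {"small": 0, "medium": 1, "large": 2}

# Hypothetical observed data
observations = ["medium", "small", "large", "small"]

# Look up each observation in the mapping
encoded = [size_order[s] for s in observations]
print(encoded)  # [1, 0, 2, 0]
```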

In contrast, one-hot encoding converts these to 3 binary columns without any implicit ordering. Ordinal encoding uses a single integer feature, keeping the data more compact.
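To make that contrast concrete, here is a minimal plain-Python sketch of both encodings for the same observations (the ordered category list is assumed to be known in advance):

```python
categories = ["small", "medium", "large"]
observations = ["medium", "small", "large"]

# One-hot: one binary column per category, so 3 features per row
one_hot = [[1 if obs == cat else 0 for cat in categories] for obs in observations]

# Ordinal: a single integer feature per row
ordinal = [categories.index(obs) for obs in observations]

print(one_hot)  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
print(ordinal)  # [1, 0, 2]
```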

Additional Considerations

  1. Ordinal encoding assumes a meaningful sequential ordering between categories. It should not be used for nominal categories like [“red”, “green”, “blue”] that lack an order.
  2. The integers should be evenly spaced. If gaps are left between them, some models may incorrectly infer that certain categories are closer together, or farther apart, than others.
  3. Ordinal encoding works well for categories with a small number of values, but can become cumbersome if many categories exist.
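Point 2 above can be illustrated with a quick sketch; the gapped mapping below is a made-up example of what to avoid:

```python
# Evenly spaced mapping vs. a (hypothetical) gapped mapping
even = {"small": 0, "medium": 1, "large": 2}
gapped = {"small": 0, "medium": 1, "large": 10}

# Distance-based models treat these integers as magnitudes, so the
# gapped mapping makes "large" look nine times farther from "medium"
print(even["large"] - even["medium"])    # 1
print(gapped["large"] - gapped["medium"])  # 9
```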

Useful Python Code

Option 1: Using numpy, and very simple code:

import numpy as np

sizes = ['small', 'medium', 'large']

# Encode sizes as incremental integers (assumes the list is already in order)
encoded_sizes = np.arange(len(sizes))

print(encoded_sizes)

This will output the following:

[0 1 2]

Option 2: Using the scikit-learn library (this is the preferred method):

import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# By default, OrdinalEncoder sorts categories alphabetically,
# so pass the intended order explicitly
encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])

sizes = ["small", "medium", "large"]
# reshape to a 2D array with one column, as scikit-learn expects
sizes = np.array(sizes).reshape(-1, 1)

encoded = encoder.fit_transform(sizes)

print(encoded)

This will output the following:

[[0.]
 [1.]
 [2.]]
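A handy follow-up: scikit-learn's OrdinalEncoder can also map the integers back to the original labels via inverse_transform. A short sketch, passing categories explicitly so the order is small < medium < large rather than alphabetical (the observation values are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Pass the intended order explicitly; the default sorts categories alphabetically
encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])

sizes = np.array([["medium"], ["small"], ["large"]])
encoded = encoder.fit_transform(sizes)

# Map the integers back to the original string labels
decoded = encoder.inverse_transform(encoded)

print(encoded.ravel())  # [1. 0. 2.]
print(decoded.ravel())  # ['medium' 'small' 'large']
```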

And that’s all! I will leave you with some “fun” trivia 😊

Trivia

  • Some early machine learning papers from the 1960s-1980s, which explored representing categorical values numerically for algorithms, already used basic ordinal encoding approaches.
  • According to Wikipedia, the term “ordinal encoding” may have first appeared in the 1986 paper “Learning Internal Representations by Error Propagation”.



Data Scientist, University Instructor, and Chess enthusiast. ML specialist.