Published in The Startup

Difference Between Standardization & Normalization

This blog explains two of the most commonly confused concepts in feature engineering: Standardization and Normalization. The two look very similar, and most of the time people struggle to tell them apart or to decide when to use each. No worries, though: this blog will act as a helping hand and walk through the difference between them and the use-cases for each.


It's completely fine if you feel confused about "Standardization" vs "Normalization". A few months ago, I was one of you, so I completely understand the feeling of confusion, and sometimes frustration too, because there was no good, easy resource explaining the topic.

But there is no need to worry, because this blog will not only clear up all the doubts about these topics but also cover their use-cases, i.e. when to use which.

Important Prior Knowledge!

Before explaining the difference between "Standardization" and "Normalization", let me build some context.

Standardization and Normalization are both part of Feature Engineering, which in turn is a part of Data Science.

If you want to learn about the Data Science Pipeline, then check this blog:

Feature Engineering means applying your engineering mind and skills to optimize the features so that a model can be trained on them effectively and easily.

Standardization and Normalization are both used for Feature Scaling (scaling the features into a specified range, instead of leaving them in a large range that is harder for the model to handle), but they differ in how they work, and each should be used in specific use-cases (discussed later in this blog).

This much information is enough for setting up the context before explaining the topics. Now, let us jump directly to the main topics.


Standardization!

This concept refers to rescaling the data so that it has the properties of a standard normal distribution.

It transforms the data so that its mean becomes 0 and its variance (and therefore its standard deviation) becomes 1. Note that it only shifts and rescales the data; it does not change the shape of the distribution.

For example, consider the data shown below:

Raw Data [Image by Author!]

Now, when standardization is applied to this data, it is transformed into the data shown below.

Standardized Data [Image by Author]!

The formula used to apply the transformation:

z = (x − μ) / σ

Standardization Formula! [Image by Author!]

In the above formula, x is a value in the data, μ ("mu") is the mean of the data, and σ ("sigma") is the standard deviation of the data (not the variance; the variance is σ²).

Implementing Standardization to Data!

Code to apply standardization to the data!
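The code image from the original post is not reproduced here, but a minimal sketch of the same idea using scikit-learn's StandardScaler (with made-up sample data, not the article's own example) would look like this:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up sample data with a single feature column.
X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# StandardScaler computes (x - mean) / std for each column.
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

print(X_std.ravel())  # values centered around 0
print(X_std.mean())   # ~0.0
print(X_std.std())    # ~1.0
```

After fitting, the same scaler can be applied to new data with `scaler.transform(...)`, reusing the mean and standard deviation learned from the training data.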


Normalization!

This concept refers to transforming the data into the range [0, 1].

Each value in the dataset is transformed into the range between 0 and 1, so that the data falls within a narrow, consistent range, which helps the model learn.

For example, consider the data shown below:

Raw Data [Image by Author!]

Now, when normalization is applied to this data, it is transformed into the data shown below.

Normalized Data! [Image by Author!]

The formula used to apply the transformation:

x′ = (x − X_min) / (X_max − X_min)

Normalization Formula! [Image by Author!]

Capital X_min and X_max represent the minimum and maximum values in the dataset, respectively.

Small x represents a particular data record in the data.

Implementing Normalization to Data!

Code to normalize the data!
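Again, the original code image is not shown here; a minimal sketch with scikit-learn's MinMaxScaler (on the same made-up data, not the article's example) looks like this:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up sample data with a single feature column.
X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# MinMaxScaler computes (x - min) / (max - min) for each column.
X_norm = MinMaxScaler().fit_transform(X)

print(X_norm.ravel())  # 0, 0.25, 0.5, 0.75, 1
```

Here the minimum (10) maps to 0 and the maximum (50) maps to 1, with every other value falling proportionally in between.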

Note: Both of the above scalers, StandardScaler and MinMaxScaler, are sensitive to outliers in the data, because the statistics they rely on (mean and standard deviation for StandardScaler; minimum and maximum for MinMaxScaler) are computed from every data point, outliers included.
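To illustrate that sensitivity, here is a small hypothetical example: a single extreme value drags the maximum up, so min-max scaling squashes all the ordinary values into a tiny sliver of [0, 1]:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature where one record (1000.0) is an outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

X_norm = MinMaxScaler().fit_transform(X)
print(X_norm.ravel())  # the four ordinary values all land below 0.004
```

The ordinary values become nearly indistinguishable after scaling, which is exactly the distortion the note above warns about.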

Use-case of Standardizer!

  • It is used with most Machine Learning models, and in my experience (and that of many others), it often outperforms MinMaxScaler (Normalization).
  • It fits anywhere there is no need to scale features into the range 0 to 1.
  • Since it transforms a normal data distribution into the standard normal distribution, which many models implicitly expect, it is often the best default choice for machine learning models.

Use-case of Normalizer!

  • Every situation where the range of features should be between 0 and 1. For example, with image data, pixel intensities range from 0 to 255 (256 levels in total), and here the Normalizer is the best one to use.
  • In any other scenario where this fixed range is expected, MinMaxScaler is the optimal choice.

I hope this article explains everything related to the topic, with all the underlying concepts and explanations. Thank you so much for investing your time in reading my blog and boosting your knowledge. If you like my work, please give this blog a round of applause!


