# Feature Scaling with Python’s scikit-learn

One of the primary objectives of normalization is to bring the data close to zero. That makes the optimization problem more “numerically stable”.

Now, scaling with the mean and standard deviation works best when the data is roughly normally distributed, that is, when most of the data is sufficiently close to the mean. Shifting the mean to zero then ensures that most components of most data points are close to 0. Specifically, about 68% of the scaled values of a normally distributed feature fall between -1 and 1.
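The 68% figure is easy to check numerically. The snippet below is a minimal sketch, assuming a synthetic normal sample; the seed, sample size, and variable names are illustrative choices, not from the original post.

```python
import numpy as np

# Draw a large sample from a normal distribution (parameters are arbitrary).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=100_000)

# Standardize by hand: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()

# Fraction of standardized values that land in [-1, 1].
share_within_one = np.mean(np.abs(z) <= 1.0)
print(f"share within [-1, 1]: {share_within_one:.3f}")
```

For a sample that really is normal, the printed share should sit close to 0.68; for a heavily skewed sample it can differ noticeably.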

In this post we explore four methods of feature scaling that are implemented in scikit-learn:

• `StandardScaler`
• `MinMaxScaler`
• `RobustScaler`
• `Normalizer`

# Standard Scaler

The mean and standard deviation are calculated for each feature, and the feature is then scaled according to:

$$z_i = \frac{x_i - \text{mean}(x)}{\text{stdev}(x)}$$

If data is not normally distributed, this is not the best scaler to use.
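To make the standardization concrete, here is a small sketch that computes it by hand and compares it with `StandardScaler`. The synthetic array `X` and the variable names are assumptions made for illustration. Note that `StandardScaler` uses the population standard deviation, which matches NumPy's default `ddof=0`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Three synthetic features with different means and spreads (illustrative).
rng = np.random.default_rng(1)
X = rng.normal(loc=[0.0, 5.0, -5.0], scale=[2.0, 3.0, 5.0], size=(1000, 3))

# Manual standardization, column by column (np.std defaults to ddof=0).
manual = (X - X.mean(axis=0)) / X.std(axis=0)

# The same transformation through scikit-learn.
scaled = StandardScaler().fit_transform(X)

print(np.allclose(manual, scaled))
```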

Let’s take a look at it in action:

```python
import pandas as pd
import numpy as np
from sklearn import preprocessing
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter magic: render plots inside the notebook
%matplotlib inline
matplotlib.style.use('ggplot')
```

```python
np.random.seed(1)

# Three normally distributed features with different means and spreads
df = pd.DataFrame({
    'x1': np.random.normal(0, 2, 10000),
    'x2': np.random.normal(5, 3, 10000),
    'x3': np.random.normal(-5, 5, 10000)
})

scaler = preprocessing.StandardScaler()
scaled_df = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_df, columns=['x1', 'x2', 'x3'])

# Plot the densities before and after scaling, side by side
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(6, 5))

ax1.set_title('Before Scaling')
sns.kdeplot(df['x1'], ax=ax1)
sns.kdeplot(df['x2'], ax=ax1)
sns.kdeplot(df['x3'], ax=ax1)

ax2.set_title('After Standard Scaler')
sns.kdeplot(scaled_df['x1'], ax=ax2)
sns.kdeplot(scaled_df['x2'], ax=ax2)
sns.kdeplot(scaled_df['x3'], ax=ax2)

plt.show()
```

All features are now on the same scale relative to one another.

# Min-Max Scaler

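The `MinMaxScaler` rescales each feature according to:

$$x_i' = \frac{x_i - \min(x)}{\max(x) - \min(x)}$$

It essentially shrinks the range so that it lies between 0 and 1. In scikit-learn this is the default even when the data contains negative values; a different interval, such as -1 to 1, can be requested through the `feature_range` parameter.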
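The following sketch checks the min-max arithmetic by hand against `MinMaxScaler` and shows how the output interval is controlled in scikit-learn. The tiny array `X` is made up for illustration; `feature_range` is a real `MinMaxScaler` parameter whose default is `(0, 1)`, including for data with negative values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data: the first column contains negative values (illustrative).
X = np.array([[-4.0, 10.0],
              [0.0, 20.0],
              [6.0, 40.0]])

# Manual min-max scaling, column by column.
manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Default output interval is [0, 1], even with negative inputs.
default_scaled = MinMaxScaler().fit_transform(X)

# A [-1, 1] interval has to be requested explicitly.
symmetric_scaled = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

print(default_scaled)
print(symmetric_scaled)
```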

This scaler works better for cases in which the standard scaler might not work so well. If the distribution is not Gaussian or the standard deviation is very small, the min-max scaler works better.

However, it is sensitive to outliers, so if there are outliers in the data, you might want to consider the `RobustScaler` below.

For now, let’s see the `MinMaxScaler` in action:

```python
df = pd.DataFrame({
    # positive skew
    'x1': np.random.chisquare(8, 1000),
    # negative skew
    'x2': np.random.beta(8, 2, 1000) * 40,
    # no skew
    'x3': np.random.normal(50, 3, 1000)
})

scaler = preprocessing.MinMaxScaler()
scaled_df = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_df, columns=['x1', 'x2', 'x3'])

# Plot the densities before and after scaling, side by side
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(6, 5))

ax1.set_title('Before Scaling')
sns.kdeplot(df['x1'], ax=ax1)
sns.kdeplot(df['x2'], ax=ax1)
sns.kdeplot(df['x3'], ax=ax1)

ax2.set_title('After Min-Max Scaling')
sns.kdeplot(scaled_df['x1'], ax=ax2)
sns.kdeplot(scaled_df['x2'], ax=ax2)
sns.kdeplot(scaled_df['x3'], ax=ax2)

plt.show()
```

Notice that the skewness of the distribution is maintained but the 3 distributions are brought into the same scale so that they overlap.
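The claim that skewness survives min-max scaling can also be checked numerically rather than by eye. This is a rough sketch that regenerates similar data with its own seed (an assumption, so the numbers will not match the plot exactly) and compares `DataFrame.skew()` before and after scaling.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(2)
df = pd.DataFrame({
    'x1': rng.chisquare(8, 1000),      # positive skew
    'x2': rng.beta(8, 2, 1000) * 40,   # negative skew
    'x3': rng.normal(50, 3, 1000),     # roughly no skew
})

scaled = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns)

# Sample skewness of each column before and after scaling.
skew_before = df.skew()
skew_after = scaled.skew()

print(pd.DataFrame({'before': skew_before, 'after': skew_after}))
```

The two columns should agree up to floating-point error, because min-max scaling only shifts each feature and multiplies it by a positive constant, and skewness is unaffected by both operations.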

# Robust Scaler

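The `RobustScaler` works much like the Min-Max scaler, but instead of the minimum and maximum it uses the median and the interquartile range, which makes it robust to outliers. With scikit-learn's default settings it follows the formula:

$$x_i' = \frac{x_i - \text{median}(x)}{Q_3(x) - Q_1(x)}$$

for each feature.

Of course, this means the scaling is determined by the central part of the data rather than by its extremes, so it’s more suitable for when there are outliers in the data.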
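As a sanity check on what the scaler computes, the sketch below reproduces `RobustScaler` by hand. It assumes scikit-learn's default settings, under which the scaler subtracts the median and divides by the interquartile range (75th minus 25th percentile); the sample data and variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# One feature: a main cluster near 20 plus a few low outliers near 1.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(20, 1, 1000), rng.normal(1, 1, 25)]).reshape(-1, 1)

# Quartiles and median of the raw feature.
q1, median, q3 = np.percentile(x, [25, 50, 75])

# Manual robust scaling: center on the median, divide by the IQR.
manual = (x - median) / (q3 - q1)

scaled = RobustScaler().fit_transform(x)

print(np.allclose(manual, scaled))
```

Since the 25 outliers make up only a small share of the sample, they barely move the median or the quartiles, which is why the bulk of the data ends up on a comparable scale regardless of how extreme the outliers are.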

Let’s take a look at this one in action on some data with outliers:

```python
x = pd.DataFrame({
    # Distribution with lower outliers
    'x1': np.concatenate([np.random.normal(20, 1, 1000), np.random.normal(1, 1, 25)]),
    # Distribution with higher outliers
    'x2': np.concatenate([np.random.normal(30, 1, 1000), np.random.normal(50, 1, 25)]),
})

scaler = preprocessing.RobustScaler()
robust_scaled_df = scaler.fit_transform(x)
robust_scaled_df = pd.DataFrame(robust_scaled_df, columns=['x1', 'x2'])

scaler = preprocessing.MinMaxScaler()
minmax_scaled_df = scaler.fit_transform(x)
minmax_scaled_df = pd.DataFrame(minmax_scaled_df, columns=['x1', 'x2'])

# Compare the raw data with both scalers, side by side
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(9, 5))

ax1.set_title('Before Scaling')
sns.kdeplot(x['x1'], ax=ax1)
sns.kdeplot(x['x2'], ax=ax1)

ax2.set_title('After Robust Scaling')
sns.kdeplot(robust_scaled_df['x1'], ax=ax2)
sns.kdeplot(robust_scaled_df['x2'], ax=ax2)

ax3.set_title('After Min-Max Scaling')
sns.kdeplot(minmax_scaled_df['x1'], ax=ax3)
sns.kdeplot(minmax_scaled_df['x2'], ax=ax3)

plt.show()
```

Notice that after Robust scaling, the distributions are brought into the same scale and overlap, but the outliers remain outside of the bulk of the new distributions.

However, in Min-Max scaling, the two normal distributions are kept separate by the outliers that are inside the 0–1 range.

# Normalizer

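Unlike the scalers above, the `Normalizer` works on each sample (row) rather than each feature (column): it divides every value in a sample by that sample's magnitude, by default the Euclidean norm, in n-dimensional space for n features.

Say your features were x, y, and z Cartesian coordinates; your scaled value for x would be:

$$\frac{x_i}{\sqrt{x_i^2 + y_i^2 + z_i^2}}$$

Each nonzero point is now exactly 1 unit from the origin of this Cartesian coordinate system.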
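A quick way to see what `Normalizer` does is to divide a few rows by their own length and compare. The snippet is a minimal sketch with made-up points; it relies on the default `norm='l2'`, which rescales each sample (row) to unit Euclidean length.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Three hand-picked 3D points (illustrative).
points = np.array([[3.0, 4.0, 0.0],
                   [1.0, 2.0, 2.0],
                   [-10.0, 0.0, 0.0]])

# Manual normalization: divide each row by its Euclidean length.
manual = points / np.linalg.norm(points, axis=1, keepdims=True)

scaled = Normalizer().fit_transform(points)

# Every nonzero row now has length 1.
lengths = np.linalg.norm(scaled, axis=1)
print(scaled)
print(lengths)
```

Note that the normalization runs across the features of a single sample, not down a column, so the result for one row does not depend on any other row.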

```python
# Registers the 3D projection on older matplotlib versions
from mpl_toolkits.mplot3d import Axes3D

df = pd.DataFrame({
    'x1': np.random.randint(-100, 100, 1000).astype(float),
    'y1': np.random.randint(-80, 80, 1000).astype(float),
    'z1': np.random.randint(-150, 150, 1000).astype(float),
})

scaler = preprocessing.Normalizer()
scaled_df = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_df, columns=df.columns)

# 3D scatter plots of the points before and after normalization
fig = plt.figure(figsize=(9, 5))
ax1 = fig.add_subplot(121, projection='3d')
ax2 = fig.add_subplot(122, projection='3d')
ax1.scatter(df['x1'], df['y1'], df['z1'])
ax2.scatter(scaled_df['x1'], scaled_df['y1'], scaled_df['z1'])

plt.show()
```

Note that the points are all brought onto the surface of a sphere of radius 1 centered at the origin. Also, the axes that previously had different scales now all share one scale.

Written by

## Towards AI