Z-Distribution or Z-Score Application in Machine Learning

Data pre-processing concept

Amit Chauhan
The Pythoneers


The z-score is a way to standardize data to a common scale: it measures how far a data point is from the mean, in units of standard deviation. A z-score is positive when the data point lies above the mean and negative when it lies below the mean.

In other words, the z-score of a data point is the number of standard deviations by which it lies away from the mean.

The formula to get the z-score is shown below:

z = (data point - mean)/standard deviation 
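
As a minimal sketch of the formula above, the snippet below standardizes a small list of numbers using only the Python standard library. The helper name `z_scores` and the sample values are illustrative assumptions, not taken from the article, and the sketch assumes the population standard deviation (`statistics.pstdev`); use `statistics.stdev` instead if the data is a sample.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of every value in `values`.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so the division is undefined.
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]


# Illustrative data (hypothetical): mean is 5, standard deviation is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(data)
print(scores)  # values below the mean are negative, values above are positive
```

Here the value 9 gets a z-score of 2 because it lies two standard deviations above the mean, while 2 gets a z-score of -1.5.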

The z-score is easiest to interpret when the data follows a normal distribution curve, with no left or right skew, because a given z-score then corresponds to a known share of the data. The image below shows these curves.

Normal Distribution: The normal distribution is a curve in which the data is spread symmetrically on both sides of the mean.

Right Skew: The data is skewed to the right when the long tail of the curve stretches to the right, while most of the data sits on the left side. Outliers, if any, are mostly on the right side.

Left Skew: The data is skewed to the left when the long tail of the curve stretches to the left, while most of the data sits on the right side. If we talk about outliers they are mostly…
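
The three shapes described above can also be told apart numerically. The sketch below computes a moment-based skewness coefficient, which is roughly zero for symmetric data, positive for right skew, and negative for left skew. The function name `skewness` and the three small datasets are hypothetical and chosen only for illustration.

```python
from statistics import mean


def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    mu = mean(values)
    n = len(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # population variance
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("variance is zero; skewness is undefined")
    return m3 / m2 ** 1.5


# Hypothetical datasets, one per shape.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]   # long tail on the right
left_skewed = [1, 8, 9, 9, 10, 10]   # long tail on the left

for name, data in [("symmetric", symmetric),
                   ("right skew", right_skewed),
                   ("left skew", left_skewed)]:
    print(name, round(skewness(data), 3))
```

The sign of the result is what matters here: a positive value points to a right tail and a negative value to a left tail.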
