Published in Analytics Vidhya

Normal distribution

The normal distribution is a probability distribution that describes how the values of a variable are distributed. It is a symmetric distribution: most of the observations cluster around the central peak, and the probabilities for values farther from the mean taper off equally in both directions.

The normal distribution is also called the bell curve or the Gaussian distribution. It is bell shaped and symmetric about a vertical line through its center. The mean, median and mode are all equal and located at the center of the distribution.

In the image below, the values are plotted as a histogram (shown in yellow), and the bell curve drawn over the histogram shows the normal distribution.
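To make the link between a histogram and the bell curve concrete, here is a minimal sketch using only Python's standard library. The data is simulated, and the mean of 170 and standard deviation of 10 are made-up values for illustration. For each bin it compares the share of samples in the bin (the histogram) with the probability the normal curve assigns to the same range.

```python
from statistics import NormalDist

# Assumption: hypothetical heights in cm, mean 170 and standard deviation 10.
dist = NormalDist(mu=170, sigma=10)
heights = dist.samples(10_000, seed=42)  # simulated data, not real measurements

bin_edges = list(range(140, 201, 10))  # 140, 150, ..., 200
rows = []
for lo, hi in zip(bin_edges, bin_edges[1:]):
    # Histogram: fraction of the simulated values that fall in [lo, hi)
    observed = sum(lo <= h < hi for h in heights) / len(heights)
    # Bell curve: area under the normal curve between lo and hi
    expected = dist.cdf(hi) - dist.cdf(lo)
    rows.append((lo, hi, observed, expected))
    bar = "#" * round(observed * 100)
    print(f"{lo}-{hi} cm  histogram: {observed:.3f}  bell curve: {expected:.3f}  {bar}")
```

The text bars give a rough sideways histogram; the two numeric columns should be close to each other for every bin.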

Many measurements in business and other fields approximately follow this pattern, which is why the normal distribution is so widely used in statistics and across different sectors. A few examples are given below.

· Heights of people.

· Measurement errors.

· Blood pressure.

· Points on a test.

· IQ scores.

· Salaries (roughly, within a narrow group; across a whole population they tend to be right-skewed).

· Sizes of things produced by machines.

· The total from rolling many dice (a single roll is uniform, not normal).

· The number of heads in many coin tosses.

· Shoe sizes in a population.

Properties of a normal distribution:

1. The curve is symmetric about the center (i.e., around the mean, μ).

2. Exactly half of the values are to the left of center and exactly half the values are to the right.

3. The total area under the curve is 1.

4. The mean, median and mode are all equal.
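The four properties above can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the mean of 50 and standard deviation of 8 are arbitrary values picked for illustration, and the area is approximated with a simple midpoint sum rather than an exact integral.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=8)  # arbitrary illustrative parameters

# 1. Symmetry: the curve has the same height at equal distances from the mean.
left_density = dist.pdf(dist.mean - 5)
right_density = dist.pdf(dist.mean + 5)

# 2. Half of the probability lies to the left of the center.
area_left_of_mean = dist.cdf(dist.mean)

# 3. Total area under the curve, approximated by a midpoint sum
#    over the range mean +/- 8 standard deviations.
n_steps = 10_000
width = 16 * dist.stdev
step = width / n_steps
start = dist.mean - 8 * dist.stdev
total_area = sum(dist.pdf(start + (i + 0.5) * step) for i in range(n_steps)) * step

print(f"density at mean-5 and mean+5: {left_density:.5f}, {right_density:.5f}")
print(f"area left of the mean: {area_left_of_mean}")
print(f"approximate total area: {total_area:.6f}")
# 4. Mean, median and mode coincide.
print(f"mean, median, mode: {dist.mean}, {dist.median}, {dist.mode}")
```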

The single most important distribution in statistics is the normal distribution.

· It is a continuous distribution and is the basis of the familiar symmetric bell-shaped curve.

· Any particular normal distribution is specified by its mean and standard deviation.

· By changing the mean, the normal curve shifts to the right or left.

· By changing the standard deviation, the curve becomes more or less spread out.

· The normal distribution is a two-parameter family, where the two parameters are the mean and standard deviation.
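As a quick illustration of how the two parameters act on the curve, the sketch below builds three normal distributions with `statistics.NormalDist` and reports where each peak sits, how tall it is, and how wide the central 95% of the curve is. The parameter values are arbitrary and chosen only to make the comparison easy to read.

```python
from statistics import NormalDist

curves = {
    "base": NormalDist(mu=0, sigma=1),
    "shifted": NormalDist(mu=3, sigma=1),  # larger mean, same spread
    "wider": NormalDist(mu=0, sigma=2),    # same mean, larger spread
}

summary = {}
for name, curve in curves.items():
    peak_location = curve.mode                # the peak sits at the mean
    peak_height = curve.pdf(curve.mode)       # height of the curve at the peak
    # Width of the central interval that holds 95% of the probability
    central_95_width = curve.inv_cdf(0.975) - curve.inv_cdf(0.025)
    summary[name] = (peak_location, peak_height, central_95_width)
    print(f"{name:8s} peak at x = {peak_location:.1f}, "
          f"height = {peak_height:.3f}, 95% width = {central_95_width:.2f}")
```

Changing only the mean moves the peak without changing its height or width; doubling the standard deviation halves the peak height and doubles the width.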

Parameters of Normal Distribution:

The two parameters of a normal distribution are the mean and the standard deviation. These parameters determine the shape and probabilities of the distribution. The position and shape of the curve along the x-axis change as the parameter values change.

1. Mean:

· The mean is a measure of central tendency.

· The mean is used to describe the distribution of variables measured on ratio or interval scales.

· In a normal distribution graph, the mean defines the location of the peak, and most of the data points are clustered around the mean.

· Any changes made to the value of the mean move the curve either to the left or right along the X-axis.

2. Standard Deviation:

· The Standard Deviation is a measure of how spread out numbers are.

· The standard deviation determines the width of the curve, and it tightens or expands the width of the distribution along the x-axis.

· A smaller standard deviation indicates that the data is tightly clustered around the mean; the normal distribution will be taller and narrower.

· A larger standard deviation indicates that the data is spread out around the mean; the normal distribution will be flatter and wider.

There are a few rules that describe the standard deviation. One of them is the empirical rule, which tells us what percentage of our data falls within a certain number of standard deviations from the mean:

• About 68% of the data falls within one standard deviation of the mean.

• About 95% of the data falls within two standard deviations of the mean.

• About 99.7% of the data falls within three standard deviations of the mean.
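The percentages in the empirical rule can be reproduced from the cumulative distribution function. Below is a minimal sketch with `statistics.NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k, dist=NormalDist()):
    """Probability that a value lies within k standard deviations of the mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
# prints approximately 0.6827, 0.9545 and 0.9973
```

The result does not depend on the particular mean and standard deviation: `share_within(1, NormalDist(100, 15))` gives the same value as `share_within(1)`.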

Characteristics of Normal distribution:

· Symmetric:

A normal distribution has a perfectly symmetrical shape. The distribution curve can be divided in the middle to produce two equal halves. The symmetric shape means that one-half of the observations fall on each side of the center.

· The mean, median, and mode are equal:

The middle point of a normal distribution is the point where the curve is highest, which means that it is where the observations of the variable are most concentrated. The midpoint is also the point where these three measures fall. The measures are exactly equal in a perfectly normal distribution.

· Empirical rule:

In normally distributed data, a constant proportion of the area under the curve lies between the mean and any given number of standard deviations from the mean. This is the rule explained above under standard deviation.

· Skewness and kurtosis:

Skewness and kurtosis are coefficients that measure how different a distribution is from a normal distribution. Skewness measures how asymmetric a distribution is (a normal distribution has a skewness of zero), while kurtosis measures the thickness of the tail ends relative to the tails of a normal distribution.
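To see what these two coefficients look like in practice, here is a rough sketch that computes them from simulated data. It assumes the common moment-based definitions (the average cubed and fourth-power z-score, with 3 subtracted from the latter to give excess kurtosis); other libraries may apply small-sample corrections. The function names and sample sizes are choices made for this example.

```python
import random
from statistics import fmean, pstdev

def skewness(data):
    """Average cubed z-score; 0 for a perfectly symmetric distribution."""
    m, s = fmean(data), pstdev(data)
    return fmean(((x - m) / s) ** 3 for x in data)

def excess_kurtosis(data):
    """Average fourth-power z-score minus 3; 0 for a normal distribution."""
    m, s = fmean(data), pstdev(data)
    return fmean(((x - m) / s) ** 4 for x in data) - 3

rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(0, 1) for _ in range(50_000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(50_000)]  # right-skewed data

print(f"normal: skewness = {skewness(normal_sample):.2f}, "
      f"excess kurtosis = {excess_kurtosis(normal_sample):.2f}")
print(f"skewed: skewness = {skewness(skewed_sample):.2f}, "
      f"excess kurtosis = {excess_kurtosis(skewed_sample):.2f}")
```

For the normal sample both values should be close to zero, while the exponential sample should show clearly positive skewness and heavier tails.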

Methods of solving Normal Distribution related problems:

All normal distributions, like the standard normal distribution, are unimodal and symmetrically distributed with a bell-shaped curve. However, a normal distribution can take on any value as its mean and any positive value as its standard deviation. In the standard normal distribution, the mean and standard deviation are always fixed at 0 and 1.

Every normal distribution is a version of the standard normal distribution that’s been stretched or squeezed and moved horizontally right or left.

The mean determines where the curve is centered. Increasing the mean moves the curve right, while decreasing it moves the curve left. The standard deviation stretches or squeezes the curve. A small standard deviation results in a narrow curve, while a large standard deviation leads to a wide curve.
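The idea that every normal distribution is a stretched and shifted standard normal can be sketched with `statistics.NormalDist`, which supports multiplying by and adding a constant. The mean of 100 and standard deviation of 15 below are illustrative values only.

```python
from statistics import NormalDist

standard = NormalDist()   # standard normal: mean 0, standard deviation 1
mu, sigma = 100, 15       # assumption: an arbitrary target mean and spread

# Stretch by sigma, then move right by mu.
stretched_and_shifted = standard * sigma + mu
direct = NormalDist(mu, sigma)

print(stretched_and_shifted)
print(direct)

# One standard deviation above the mean covers the same area in both curves.
print(f"{direct.cdf(mu + sigma):.4f} vs {standard.cdf(1):.4f}")
```

Both printed distributions should have the same mean and standard deviation, and the two cumulative probabilities should match.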

How to Standardize a normal distribution:

When we standardize a normal distribution, the mean becomes 0 and the standard deviation becomes 1. This allows you to easily calculate the probability of certain values occurring in your distribution, or to compare data sets with different means and standard deviations.

The z-score changes with the value of x and with the mean:

· A positive z-score means that your x-value is greater than the mean.

· A negative z-score means that your x-value is less than the mean.

· A z-score of zero means that your x-value is equal to the mean.
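The sketch below standardizes a small made-up data set to show both points: the signs of the z-scores follow the rules above, and the standardized values end up with a mean of 0 and a standard deviation of 1. It uses the population standard deviation (`pstdev`), which is an assumption of this example.

```python
from statistics import fmean, pstdev

scores = [50, 60, 70, 80, 90]  # made-up values for illustration
mean = fmean(scores)           # 70.0
std_dev = pstdev(scores)       # population standard deviation

z_scores = [(x - mean) / std_dev for x in scores]

for x, z in zip(scores, z_scores):
    if z > 0:
        position = "above the mean"
    elif z < 0:
        position = "below the mean"
    else:
        position = "equal to the mean"
    print(f"x = {x}: z = {z:+.3f} ({position})")

print(f"mean of z-scores: {fmean(z_scores):.3f}")
print(f"standard deviation of z-scores: {pstdev(z_scores):.3f}")
```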

Let’s see how to calculate the z-score first, and then the probability.

Steps: 1) Subtract the mean from your individual value.

2) Divide the difference by the standard deviation.

Applying these two steps gives the formula:

z = (x - μ) / σ

where x is the individual value, μ is the mean and σ is the standard deviation.
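The two steps above can be written as a small function. This is only a sketch; the function name `z_score` and the exam figures in the usage lines are made up for illustration.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    difference = x - mean        # Step 1: subtract the mean from the value
    return difference / std_dev  # Step 2: divide by the standard deviation

# Hypothetical example: comparing scores from two exams with different scales.
exam_a = z_score(85, mean=70, std_dev=10)
exam_b = z_score(80, mean=65, std_dev=6)
print(f"exam A z-score: {exam_a}")
print(f"exam B z-score: {exam_b}")
```

Although the raw score on exam A is higher, the score on exam B sits further above its own mean once both are put on the same standardized scale.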

Once you have a z-score, you can look up the corresponding probability in a z-table. Depending on the sign of the z value, we refer to either the positive or the negative z-table.

To find the probability, we refer to the z-table given below.

The first column of a z-table contains the z-score up to the first decimal place. The top row of the table gives the second decimal place.
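If no printed table is at hand, the same layout can be generated from the standard normal cumulative distribution function. The following sketch prints the first few rows of a positive z-table; the helper name `z_table_value` and the choice of rows are assumptions of this example.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_table_value(row, column):
    """Table entry for z = row + column, e.g. row 1.2 and column 0.03 give z = 1.23."""
    return round(standard.cdf(row + column), 4)

columns = [i / 100 for i in range(10)]  # second decimal place: 0.00 to 0.09

# Header row with the second decimal place, then one line per first decimal place.
print(" z   " + "  ".join(f"{c:.2f}  " for c in columns))
for row in (0.0, 0.1, 0.2, 0.3):
    values = "  ".join(f"{z_table_value(row, c):.4f}" for c in columns)
    print(f"{row:.1f}  {values}")
```

Each entry is the area under the curve to the left of the z value formed by the row and column labels.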

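Let’s assume we have to find the probability for a z value of 0.25. As shown in the image, we go down the first column to the row for 0.2 and across the top row to the column for 0.05; the cell where they meet holds the probability.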

P(Z ≤ 0.25) = 0.5987

If we need the remaining area, that is, the probability of a value greater than z, we can subtract P(z) from 1, because the probability of the whole distribution is 1: 1 - 0.5987 = 0.4013.
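The table lookup and the subtraction from 1 can be cross-checked in code. This short sketch uses `statistics.NormalDist` in place of a printed z-table.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

z = 0.25
p_below = standard.cdf(z)  # area to the left of z, as read from the z-table
p_above = 1 - p_below      # remaining area to the right of z

print(f"P(Z <= {z}) = {p_below:.4f}")  # 0.5987
print(f"P(Z >  {z}) = {p_above:.4f}")  # 0.4013
```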


