Central Limit Theorem Simplified!

Seema Singh
3 min read · May 23, 2018


The Central Limit Theorem (CLT) is a fundamental and key concept in probability theory. It says that the statistical and probabilistic methods that work for the normal distribution can also be applied to many problems involving other types of distributions. This blog explains what the Central Limit Theorem is, why it is so important, and how it helps us solve problems where the data do not follow a normal distribution.

Central Limit Theorem

CLT Statement:

For large sample sizes, the sampling distribution of the mean approximates a normal distribution even if the population distribution is not normal.

If we have a population with mean μ and standard deviation σ and take large random samples (n ≥ 30) from the population with replacement, then the distribution of the sample means will be approximately normally distributed.

For the random samples taken from the population,

Mean of the sample means:

μ_x̄ = μ

And the standard deviation of the sample means (the standard error):

σ_x̄ = σ / √n
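A quick way to see both formulas in action is to simulate them. The sketch below is not from the original post; it assumes NumPy, an exponential (non-normal) population with mean 2, a fixed seed, and 10,000 samples of size n = 30, all arbitrary choices for illustration. The mean of the sample means should land near μ and their standard deviation near σ/√n.

```python
import numpy as np

rng = np.random.default_rng(0)

# A deliberately non-normal "population": exponential with mean 2.
population = rng.exponential(scale=2.0, size=1_000_000)
mu, sigma = population.mean(), population.std()

n = 30             # sample size, as in the statement above
num_samples = 10_000

# Draw many random samples with replacement and keep each sample mean.
sample_means = rng.choice(population, size=(num_samples, n), replace=True).mean(axis=1)

print(f"population mean mu       = {mu:.3f}")
print(f"mean of sample means     = {sample_means.mean():.3f}")  # ~ mu
print(f"sigma / sqrt(n)          = {sigma / np.sqrt(n):.3f}")
print(f"std dev of sample means  = {sample_means.std():.3f}")   # ~ sigma / sqrt(n)
```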

Why n ≥ 30?

A sample size of 30 is generally considered large enough for the effect of the Central Limit Theorem to show. The closer the population distribution is to a normal distribution, the smaller the sample size needed; if the population distribution is highly skewed, a larger sample size is required before the sampling distribution of the mean looks normal.

Demonstrating the Central Limit Theorem with n = 5 & 10

Understanding with an example:

The picture above shows three different population distributions, none of which is normal. The sampling distribution of the mean gets a little closer to a normal distribution when n = 5 and is almost normal when n = 30.
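A rough way to reproduce this kind of experiment yourself is sketched below. It is not the code behind the original figure: it assumes NumPy and Matplotlib, uses a right-skewed exponential population, and plots histograms of the sample means for n = 5 and n = 30.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# A right-skewed, clearly non-normal population.
population = rng.exponential(scale=2.0, size=500_000)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(population, bins=60)
axes[0].set_title("Population (skewed)")

# Histograms of sample means for two sample sizes.
for ax, n in zip(axes[1:], (5, 30)):
    means = rng.choice(population, size=(10_000, n), replace=True).mean(axis=1)
    ax.hist(means, bins=60)
    ax.set_title(f"Sample means, n = {n}")

plt.tight_layout()
plt.show()
```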

Why is the CLT important?

The field of statistics rests on the fact that it is rarely feasible to collect data for an entire population. Instead, we gather a sample from the population and use statistics computed on that sample to draw conclusions about the population.

In practice, the normal distribution shows up unexpectedly: the sampling distribution of the mean looks normal even when the population distribution is skewed (sometimes heavily skewed). Many statistical procedures, such as hypothesis tests, assume that the quantities they work with are normally distributed.

The Central Limit Theorem justifies this assumption: it simplifies matters and lets us work around the problem of the data coming from a population that is not normal.

Thus, even if we don’t know the shape of the distribution our data comes from, the Central Limit Theorem lets us treat the sampling distribution of the mean as approximately normal. Of course, for the conclusions of the Central Limit Theorem to hold, the sample size needs to be large enough.
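As one illustration of how this is used in practice, the sketch below builds a CLT-based 95% confidence interval for the population mean from a single sample, using x̄ ± 1.96·s/√n. It is only a minimal sketch, not part of the original post: the NumPy setup, the exponential "population", and the sample size n = 50 are all assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Pretend this is the (unknown, skewed) population we care about.
population = rng.exponential(scale=2.0, size=1_000_000)
true_mean = population.mean()

# In practice we only observe a single sample of size n >= 30.
n = 50
sample = rng.choice(population, size=n, replace=True)

# CLT-based 95% confidence interval for the population mean:
# sample mean +/- 1.96 * (sample standard deviation / sqrt(n)).
x_bar = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)
low, high = x_bar - 1.96 * se, x_bar + 1.96 * se

print(f"true population mean : {true_mean:.3f}")
print(f"95% CI from sample   : ({low:.3f}, {high:.3f})")
```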

Thank you for reading!
