Understanding the Central Limit Theorem

Chidambara
Nov 3 · 2 min read

The Central Limit Theorem (CLT) is one of the foundational concepts in statistics. In this post we will work through it with a practical example.

The code for this example is available at https://github.com/chidamnat/practicalDS/blob/master/mastery/central_limit_theorem.ipynb

Suppose we are interested in finding the mean housing price in the province of Ontario.

Ontario housing prices data

Drilling down to understand the population distribution

Population distribution (skewed to the right)

The population is heavily skewed to the right and looks quite different from a textbook normal distribution. Suppose we want to learn the mean price of the housing population. In most cases we cannot collect data for the entire population, because doing so would be too costly. This is where the CLT comes to our rescue.

The Central Limit Theorem states that, regardless of the shape of the underlying population distribution (provided its variance is finite), the distribution of the sum or mean of a large random sample of independent observations drawn from the population tends toward a normal distribution as the sample size grows.
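To make the statement concrete, here is a minimal sketch that uses only the Python standard library. The log-normal population, its parameters, the seed, and the helper names (skewness, sample_means) are assumptions chosen for illustration; they are not the Ontario data or the code from the notebook. It measures how the skewness of the sample means moves toward 0 (the value for a symmetric, normal-like distribution) as the sample size grows.

```python
import random
import statistics


def skewness(values):
    """Skewness as the mean cubed z-score (0 for a symmetric distribution)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return statistics.fmean(((v - mu) / sigma) ** 3 for v in values)


rng = random.Random(42)

# Hypothetical stand-in for the housing prices: a right-skewed log-normal population.
population = [rng.lognormvariate(13.0, 0.8) for _ in range(50_000)]


def sample_means(sample_size, n_samples=2_000):
    """Draw n_samples random samples (with replacement) and return their means."""
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


pop_skew = skewness(population)
skew_by_size = {n: skewness(sample_means(n)) for n in (2, 30, 200)}

print(f"population skewness: {pop_skew:.2f}")
for n, s in skew_by_size.items():
    print(f"sample size {n:>3}: skewness of sample means = {s:.2f}")
```

The individual prices stay skewed; it is the distribution of the sample means that becomes more symmetric as each sample gets larger.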

Thus we can estimate the population’s mean without knowing the complete population, by constructing a distribution of means from a large number of samples drawn from the underlying population. The mean of this sampling distribution approximates the population’s mean very well.

How closely does it represent the population mean? This is where the standard error, that is, the standard deviation of the sampling distribution, comes into the picture. The larger the sample size, the lower the standard error. However, the improvement is not linear in the sample size; it is given by the formula below, where n is the size of each sample.

SD of the means of the samples = population SD / sqrt(n)
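As a quick worked example of this formula, the sketch below plugs in a made-up population SD of 200,000 (an assumption for illustration, not a figure from the Ontario data). The function name standard_error is also hypothetical.

```python
import math


def standard_error(population_sd, sample_size):
    """SD of the sample means = population SD / sqrt(n)."""
    return population_sd / math.sqrt(sample_size)


population_sd = 200_000.0  # hypothetical value, for illustration only

for n in (25, 100, 400):
    print(f"n={n:>3}  standard error={standard_error(population_sd, n):,.0f}")

# Quadrupling n only halves the standard error: 40,000 -> 20,000 -> 10,000
```

Because of the square root, cutting the standard error in half requires four times as many observations per sample.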

In the notebook, we build a distribution of sample means for increasing sample sizes. The mean of the sample means approximates the population mean closely, and each increase in sample size reduces the sampling error (the SD of the distribution of the sample means).

Sample size vs Standard Error
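The experiment can be approximated with the sketch below. It is not the notebook's code: the synthetic log-normal population, the seed, the sample sizes, and the number of repetitions are all assumptions. For each sample size it compares the observed SD of the sample means with population SD / sqrt(n), and the mean of the sample means with the population mean.

```python
import math
import random
import statistics

rng = random.Random(0)

# Hypothetical right-skewed population standing in for the housing prices.
population = [rng.lognormvariate(13.0, 0.8) for _ in range(50_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

results = []
for n in (10, 50, 250):
    # 2,000 random samples of size n, drawn with replacement.
    means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(2_000)]
    observed_se = statistics.stdev(means)   # SD of the sample means
    expected_se = pop_sd / math.sqrt(n)     # value predicted by the formula
    results.append((n, statistics.fmean(means), observed_se, expected_se))

print(f"population mean: {pop_mean:,.0f}")
for n, mean_of_means, observed_se, expected_se in results:
    print(
        f"n={n:>3}  mean of means={mean_of_means:,.0f}  "
        f"observed SE={observed_se:,.0f}  expected SE={expected_se:,.0f}"
    )
```

With these settings the observed and expected standard errors should track each other closely, and both shrink as n grows.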

Written by Chidambara, ML and Data Science Engineer
