The Central Limit Theorem

Atiar Rahman
Nov 13, 2019

The central limit theorem is one of the most important concepts in all of statistics. Have you ever wondered why we can get information on a population without surveying the entire population? Let’s consider an example. Imagine there are 1,000,000 third graders in a city and we need to estimate the average score on a statewide exam. It would take a long time to collect and average a million test scores, so how do we get an estimate that is as close as possible to the actual population mean?

So you start thinking, and the first thing you do is take a sample of 50 students and average their test scores. Maybe this will give you some idea of what the mean will be. But 50 students isn’t even close to a million. So what do you do? Do you just take as large a sample as possible and hope that it’s an accurate representation of the population? Clearly this is a tricky problem. But luckily we have the central limit theorem.

The central limit theorem states:

If repeated random samples of size n are taken from a population with a finite mean and standard deviation, the sampling distribution of the sample means will have a mean equal to the population mean and a standard error equal to the population standard deviation divided by the square root of n. As n grows, this distribution also approaches a normal distribution, whatever the shape of the population.
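The statement can also be written in symbols. The notation here is an assumption added for illustration, since the statement above does not name its quantities: μ is the population mean, σ is the population standard deviation, and x̄ is the mean of one sample of size n.

```latex
\mu_{\bar{x}} = \mu,
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```

For instance, with a purely hypothetical σ of 10 points and the sample of n = 50 students from the example, the standard error would be 10/√50 ≈ 1.41 points.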

Now what does this mean? What is a sampling distribution of sample means and what is the standard error? Let’s go back to the example above. Imagine that instead of taking one sample, you took many samples, each of size n. Every time you took a sample, you found the mean of that sample. You then use these means to create a distribution. When n is large enough, this distribution of sample means will be approximately a normal curve, and the mean of this curve is equal to the mean of the population! The standard deviation of this distribution of sample means is called the standard error, and it is equal to the population standard deviation divided by the square root of n.
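The idea of building a distribution out of sample means is easier to see in a quick simulation. The sketch below is only an illustration under stated assumptions: the population is synthetic (skewed scores generated from a beta distribution, not real exam data), and the helper name simulate_sample_means, the seed, and the number of samples are all made up for this example.

```python
import numpy as np


def simulate_sample_means(population, n, num_samples, rng):
    """Draw num_samples random samples of size n and return each sample's mean."""
    # Sampling with replacement keeps the individual draws independent.
    samples = rng.choice(population, size=(num_samples, n), replace=True)
    return samples.mean(axis=1)


rng = np.random.default_rng(seed=0)

# Hypothetical population: 1,000,000 exam scores between 0 and 100,
# deliberately skewed (not bell-shaped) to show the theorem still applies.
population = 100 * rng.beta(2.0, 5.0, size=1_000_000)

n = 50               # size of each sample
num_samples = 5_000  # number of samples drawn

sample_means = simulate_sample_means(population, n, num_samples, rng)

pop_mean = population.mean()
predicted_se = population.std() / np.sqrt(n)  # sigma / sqrt(n)
observed_se = sample_means.std()              # spread of the sample means

print(f"population mean:          {pop_mean:.2f}")
print(f"mean of sample means:     {sample_means.mean():.2f}")
print(f"predicted standard error: {predicted_se:.2f}")
print(f"observed standard error:  {observed_se:.2f}")
```

If the theorem holds for this made-up population, the two means should nearly match and the observed standard error should sit close to the predicted one. Plotting a histogram of sample_means should show a roughly bell-shaped curve even though the population itself is skewed.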

More Specifics

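Looking at the formula for the standard error, there are some obvious questions one would have. How many samples need to be taken, and how large should each sample be? You generally want as many samples as possible, since more samples give a clearer picture of the sampling distribution. As for the size of each sample, a common rule of thumb is that sample sizes greater than or equal to 30 will suffice, although strongly skewed populations may need larger samples before the sample means look normal.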
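One way to get a feel for the effect of sample size is to repeat the experiment with a few different values of n. The sketch below assumes a hypothetical, strongly skewed population (an exponential distribution with mean 1 and standard deviation 1). The skewness helper, the seed, and the chosen sizes are illustrative choices, not part of the theorem.

```python
import numpy as np


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))


rng = np.random.default_rng(seed=1)
num_samples = 20_000  # samples drawn per sample size

results = {}
for n in (5, 30, 100):
    # Hypothetical skewed population: exponential with mean 1 and standard deviation 1.
    sample_means = rng.exponential(scale=1.0, size=(num_samples, n)).mean(axis=1)
    results[n] = (sample_means.std(), skewness(sample_means))

for n, (se, skew) in results.items():
    print(f"n={n:>3}  standard error={se:.3f}  "
          f"predicted={1 / np.sqrt(n):.3f}  skewness={skew:.2f}")
```

As n increases, the standard error should shrink roughly in line with 1/√n, and the skewness of the sample means should move toward 0, which is what a normal curve would have. How quickly that happens depends on the population, which is why 30 is a rule of thumb and not a guarantee.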

Conclusion

The Central Limit Theorem is essential to understanding inferential statistics. It is how statisticians can use samples to understand a population. One important thing to note is that the samples must be independent and random.
