Central Limit Theorem: One Pager
This article can serve as a gentle introduction for beginners or a quick refresher for experienced readers.
Inferential statistics is about using the sample data we have available to make inferences about the population. The Central Limit Theorem (CLT) is the result that justifies this approach. Below are the properties of the CLT:
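- Mean of the sampling distribution (μ_X̄) = Population mean (μ). This is what lets us use a sample mean as an estimate when we don't or cannot calculate the population mean
- Sampling distribution's standard deviation (standard error) = σ/√n, where σ is the population standard deviation and n is the sample size. The sampling distribution is a different topic altogether; for now, think of it as the distribution of the means calculated from many samples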
- For n > 30, the sampling distribution is approximately a normal distribution
In theory (and usually in practice), once the sample size is above 30 the sample means follow a roughly normal distribution curve, even when the population itself is not normally distributed.
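The three properties above can be checked with a small simulation. This is only a sketch under stated assumptions: the skewed population of commute times is made up (drawn from an exponential distribution with a mean of about 35 minutes), and the names `population`, `sample_means` and `share_within` are illustrative, not from any library.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical, deliberately skewed population: 20,000 commute times (minutes).
population = [rng.expovariate(1 / 35) for _ in range(20_000)]
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 100        # size of each sample
trials = 2000  # number of samples drawn

# Draw with replacement so that sigma / sqrt(n) applies without a
# finite-population correction.
sample_means = [
    statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)
]

standard_error = sigma / math.sqrt(n)

# Share of sample means within 1.96 standard errors of the population mean;
# for a normal distribution this share is about 0.95.
share_within = sum(
    abs(m - mu) <= 1.96 * standard_error for m in sample_means
) / trials

print(f"population mean      : {mu:.2f}")
print(f"mean of sample means : {statistics.fmean(sample_means):.2f}")
print(f"sigma / sqrt(n)      : {standard_error:.2f}")
print(f"SD of sample means   : {statistics.stdev(sample_means):.2f}")
print(f"share within 1.96 SE : {share_within:.3f}")
```

The first two pairs of printed numbers should land close to each other, and the last share should be near 0.95, even though the population itself is far from normal.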
Let's work through the CLT with an example:
Say there are 20,000 employees working in an organisation, and we would like to calculate their average commute time. It is practically impossible to measure this for everyone, so we calculate it for a small sample, say 100 employees, and infer the value for the population.
The average commute time for these 100 employees is 35 minutes. We can assume the population mean is close to 35 minutes, that is:
population mean = 35 ± error value
This range, sample mean ± error, is called the confidence interval. The formula for calculating the confidence interval is
confidence interval = X̄ ± z × S / √n
Sample mean = 35 (X̄)
Sample standard deviation = 9 (S)
n = 100 (sample size)
We need the z-score, which depends on the confidence level we choose. Say 95%.
Since this is a two-tailed test, we look up 0.95 + (1 − 0.95)/2 = 0.975
The z-score for 0.975 is 1.96
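Instead of reading the z-score from a table, it can be computed. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_for_confidence` is made up for illustration.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-tailed z-score for a confidence level such as 0.95."""
    # Half of the leftover probability sits in each tail, so the lookup
    # point is confidence + (1 - confidence) / 2, e.g. 0.975 for 95%.
    lookup = confidence + (1 - confidence) / 2
    return NormalDist().inv_cdf(lookup)


print(round(z_for_confidence(0.95), 2))  # 1.96
```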
Applying these values in the formula, the error value is 1.96 × 9 / √100 = 1.764, or about 1.76, so we get
35 − 1.76, 35 + 1.76, which means
with 95% confidence, the average commute time of all employees will be between 33.24 and 36.76 minutes
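The whole calculation can be sketched in a few lines of Python. The function name `confidence_interval` is hypothetical, and the sketch assumes that 9 is the standard deviation of the 100 sampled commute times, which is then divided by √n to get the standard error. With the unrounded z-score the error value comes out at about 1.76.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Return (low, high) for the population mean, using a z-score."""
    z = NormalDist().inv_cdf(confidence + (1 - confidence) / 2)
    margin = z * sample_sd / sqrt(n)  # z times the standard error
    return sample_mean - margin, sample_mean + margin


low, high = confidence_interval(sample_mean=35, sample_sd=9, n=100)
print(f"{low:.2f} to {high:.2f} minutes")  # 33.24 to 36.76 minutes
```

Raising the confidence level (say to 0.99) widens the interval, while a larger sample size n narrows it.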