Statistics: Range, Variance and Standard Deviation
Measures of Variability/Dispersion: Range, Variance and Standard Deviation, and why Variance squares the numerator and divides by (n-1)
Introduction
In this blog, we will cover the following concepts:
- Population & Sample
- What is a Measure of Variability/Dispersion?
- Range
- Variance for Population & Sample
- Why is the numerator squared in Variance?
- Why is the denominator (n-1) in Sample Variance?
- Standard Deviation for Population and Sample
Let’s get into action.
Population & Sample
Population: The population is the entire group that you want to analyse or make predictions about.
Sample: A sample is a subset of the population, ideally drawn at random. The sample is smaller than the population.
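As a rough illustration of the two definitions, here is a minimal sketch that draws a random sample from a population. The population (the integers 1 to 1000), the sample size of 50 and the seed are all made up for the example.

```python
import random

# Hypothetical population: the integers 1..1000 stand in for "the entire group"
population = list(range(1, 1001))

# Fixed seed (an arbitrary choice) so the draw is reproducible
rng = random.Random(42)

# Draw a simple random sample of 50 members, without replacement
sample = rng.sample(population, k=50)

print("Population size :", len(population))
print("Sample size     :", len(sample))
```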
It helps to be clear on the ideas of Population and Sample, Parameter and Statistic, and Biased and Unbiased before reading the rest of this blog.
Measure of Dispersion/Variability
A Measure of Variability is a descriptive statistic that represents the amount of dispersion in a dataset. While a Measure of Central Tendency describes the typical value, a Measure of Variability describes how far the data points tend to fall from the center.
There are three common Measures of Dispersion:
- Range
- Variance
- Standard Deviation
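To see why the center alone is not enough, the sketch below compares two made-up datasets (named `tight` and `wide` for this example only) that share the same mean but are spread out very differently. It uses Python's standard `statistics` module.

```python
from statistics import mean, pstdev

# Two hypothetical datasets with the same mean (50) but different spread
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, values in (("tight", tight), ("wide", wide)):
    print(
        name,
        "| mean =", mean(values),
        "| range =", max(values) - min(values),
        "| std dev =", round(pstdev(values), 2),
    )
```

Both datasets report the same mean, yet the range and standard deviation of `wide` are far larger.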
Range
Range is the difference between the largest and smallest values in a dataset. It is the simplest of the Measures of Dispersion/Variability.
Python Script
# Sample data (a list, so duplicates and order are preserved)
data = [4, 6, 9, 3, 7]

# Named value_range so the built-in range() is not shadowed
value_range = max(data) - min(data)

print("Maximum Value :", max(data))
print("Minimum Value :", min(data))
print("Range :", value_range)

"""
Output
Maximum Value : 9
Minimum Value : 3
Range : 6
"""
The range can sometimes be misleading when there are extremely high or low values.
Example: In {8, 11, 5, 9, 7, 6, 2500}:
- the lowest value is 5,
- and the highest is 2500,
So the range is 2500 − 5 = 2495, even though every other value lies between 5 and 11.
In such cases we may be better off using the Interquartile Range or the Standard Deviation.
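The effect of the extreme value can be checked with a short sketch using the same seven numbers. It relies on `statistics.quantiles` from the standard library. Quartile conventions differ between tools, so the exact Interquartile Range may vary slightly depending on the method used.

```python
from statistics import quantiles

# Dataset from the example above, including the extreme value 2500
data = [8, 11, 5, 9, 7, 6, 2500]

value_range = max(data) - min(data)

# quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3)
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print("Range               :", value_range)
print("Interquartile Range :", iqr)
```

The range is dominated by the single value 2500, while the Interquartile Range only looks at the middle half of the data and stays small.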
Variance
Variance is a Measure of Dispersion/Variability. It tells us how far the data points are spread out from the mean: it is the average of the squared deviations from the mean.
Population Variance
The variance calculated over every member of a population is known as the Population Variance.
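In the usual textbook notation (the symbols are the standard convention, assumed here: N is the population size, μ the population mean and x_i the i-th value), the Population Variance can be written as:

```latex
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
```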
Sample Variance
The variance calculated from sample data, used to estimate the variance of the population, is known as the Sample Variance.
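In the usual textbook notation (again the standard convention, assumed here: n is the sample size, x̄ the sample mean and x_i the i-th value), the Sample Variance can be written as:

```latex
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```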
Variance : Python Implementation
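One possible implementation is sketched below. The helper names `population_variance` and `sample_variance` are made up for this example, and the results are cross-checked against `pvariance` and `variance` from Python's standard `statistics` module.

```python
from statistics import pvariance, variance

data = [4, 6, 9, 3, 7]


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


print("Population Variance :", round(population_variance(data), 2))  # 4.56
print("Sample Variance     :", round(sample_variance(data), 2))      # 5.7

# The standard library gives the same results
print("statistics.pvariance:", round(pvariance(data), 2))
print("statistics.variance :", round(variance(data), 2))
```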
Why is the numerator squared in Variance?
Because if we did not square the terms, the positive and negative deviations would cancel each other out and their sum would always be zero. Squaring makes every term non-negative, so the deviations add up instead of cancelling.
Example
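A small sketch with the dataset used earlier in this post shows the cancellation. The variable names are chosen for this illustration only.

```python
data = [4, 6, 9, 3, 7]
mean = sum(data) / len(data)  # 5.8

# Signed deviations from the mean: some negative, some positive
deviations = [x - mean for x in data]

# Squared deviations: all non-negative
squared = [d ** 2 for d in deviations]

print("Deviations                :", [round(d, 2) for d in deviations])
print("Sum of deviations         :", round(abs(sum(deviations)), 10))
print("Sum of squared deviations :", round(sum(squared), 2))
```

The signed deviations sum to zero (up to floating-point rounding), while the squared deviations sum to 22.8, which is the numerator of the variance for this dataset.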
Wait . . . Have you noticed the Sample Variance formula? Compared to the Population Variance, there is a slight change in the denominator.
Why is the denominator (n-1) in Sample Variance?
There are two perspectives:
1. Bessel's Correction
When we divide by n, the sample variance tends to underestimate the population variance. The deviations are measured from the sample mean, which is calculated from the same data and therefore sits closer to the sample points than the true population mean does. This makes the estimate biased (it leans to one side, here towards smaller values). To correct this bias when estimating the population variance, we use (n-1) in the denominator.
2. Degrees of Freedom
Degrees of Freedom are the number of values that are free to vary once a statistic has been calculated. If we know the sample mean and (n-1) of the data points, the remaining data point is fixed: it can be calculated from the others. So only (n-1) of the deviations from the sample mean are independent, and we divide by (n-1). For a population, the mean is a fixed parameter rather than an estimate from a sample, so all N deviations count and we divide by N.
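Both perspectives can be checked with a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 10,000 normally distributed values, samples of size 5, 5,000 repetitions and a fixed seed.

```python
import random
from statistics import pvariance, variance

rng = random.Random(0)  # fixed seed (arbitrary) for reproducibility

# Hypothetical population with a known variance
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_variance = pvariance(population)

sample_size = 5
trials = 5_000
divided_by_n = []
divided_by_n_minus_1 = []

for _ in range(trials):
    sample = rng.sample(population, sample_size)
    divided_by_n.append(pvariance(sample))         # denominator n
    divided_by_n_minus_1.append(variance(sample))  # denominator n - 1

avg_n = sum(divided_by_n) / trials
avg_n_minus_1 = sum(divided_by_n_minus_1) / trials

print("True population variance  :", round(true_variance, 2))
print("Average when dividing by n:", round(avg_n, 2))
print("Average with (n-1)        :", round(avg_n_minus_1, 2))

# Degrees of freedom: with the sample mean and n-1 points known,
# the last point is no longer free to vary
x_bar = sum(sample) / sample_size
known_points = sample[:-1]
recovered = sample_size * x_bar - sum(known_points)
print("Last point      :", sample[-1])
print("Recovered point :", recovered)
```

On average, dividing by n lands noticeably below the true variance, while dividing by (n-1) lands close to it. The last two lines show that the final data point can be rebuilt from the sample mean and the other n-1 points.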
Standard Deviation
Standard Deviation describes how far the data points typically deviate from the mean. It is the square root of the Variance, so it is expressed in the same units as the data.
Population Standard Deviation
The Standard Deviation calculated over every member of a population is known as the Population Standard Deviation.
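In the usual textbook notation (standard convention, assumed here: N is the population size, μ the population mean and x_i the i-th value), the Population Standard Deviation can be written as:

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
```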
Sample Standard Deviation
The Standard Deviation calculated from sample data is known as the Sample Standard Deviation.
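In the usual textbook notation (standard convention, assumed here: n is the sample size, x̄ the sample mean and x_i the i-th value), the Sample Standard Deviation can be written as:

```latex
s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```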
Standard Deviation: Python Implementation
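One possible implementation is sketched below. The helper names `population_std_dev` and `sample_std_dev` are made up for this example, and the results are cross-checked against `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
from statistics import pstdev, stdev

data = [4, 6, 9, 3, 7]


def population_std_dev(values):
    """Square root of the population variance (denominator N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std_dev(values):
    """Square root of the sample variance (denominator n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two data points")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


print("Population Std. Dev :", round(population_std_dev(data), 2))  # 2.14
print("Sample Std. Dev     :", round(sample_std_dev(data), 2))      # 2.39

# The standard library gives the same results
print("statistics.pstdev   :", round(pstdev(data), 2))
print("statistics.stdev    :", round(stdev(data), 2))
```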
Conclusion
I hope this article helps you understand the Measures of Variability (Range, Variance and Standard Deviation) and Population & Sample, with example Python scripts.
If you liked this post, give it a like.
Thank you,