Measuring Variability and Spread — Range, Interquartile Range (IQR), Variance, Standard Deviation

Saurabh Dorle
Published in Omni Data Science
3 min read · Mar 6, 2023

Variability and spread are important concepts in statistics that help to describe the amount of variation or dispersion in a data set. These measures provide information about the distribution of the data, which can be used to make inferences about the population from which the data were sampled. In this blog, we will discuss four measures of variability and spread: range, interquartile range, variance, and standard deviation. We will also provide examples to illustrate how these measures can be used in practice.

Range:

Range is the difference between the largest and smallest values in a data set. It is easy to compute, but because it depends only on the two extreme values, it is sensitive to outliers: a single extreme value can make the range very large even if the rest of the values are tightly clustered.

Example: Suppose we have a data set of test scores from 10 students: 70, 72, 75, 78, 80, 82, 85, 88, 90, and 95. The range of these scores is 95 − 70 = 25.
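As a minimal sketch, the range of the test scores can be computed in Python as below. The function name value_range and the extra score of 20 are illustrative assumptions, added only to show how one outlier inflates the range.

```python
# Test scores from the example above.
scores = [70, 72, 75, 78, 80, 82, 85, 88, 90, 95]


def value_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


print(value_range(scores))  # 25

# Hypothetical outlier (not part of the original data): one very low score.
with_outlier = scores + [20]
print(value_range(with_outlier))  # 75
```

Adding a single low score triples the range even though the other ten scores are unchanged.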

Interquartile Range:

Interquartile range (IQR) is a measure of variability that represents the spread of the middle 50% of the data. It is calculated as the difference between the 75th percentile (Q3) and the 25th percentile (Q1) of the data set. This measure is less sensitive to extreme values than the range and is a more robust measure of dispersion.

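Example: Suppose we have a data set of the number of hours of sleep per night for 10 people: 5, 6, 6, 7, 7, 7, 8, 8, 9, and 10. Taking Q1 as the median of the lower half (5, 6, 6, 7, 7) gives 6, and taking Q3 as the median of the upper half (7, 8, 8, 9, 10) gives 8. Therefore, the IQR is 8 − 6 = 2. Note that several conventions exist for computing quartiles, and methods that interpolate between values can give slightly different results for the same data.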
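The sketch below computes the IQR for the sleep data using the median-of-halves convention, which is an assumption about how the quartiles in the example were obtained. The helper name quartiles_median_of_halves is hypothetical. Library routines that interpolate between values may return slightly different quartiles for the same data.

```python
from statistics import median


def quartiles_median_of_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of values, the overall median is left out of both halves.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # smallest half of the data
    upper = ordered[-half:]   # largest half of the data
    return median(lower), median(upper)


sleep_hours = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10]
q1, q3 = quartiles_median_of_halves(sleep_hours)
iqr = q3 - q1
print(q1, q3, iqr)  # 6 8 2
```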

Variance:

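Variance is the average of the squared differences between each value and the mean of the data set. Dividing the sum of squared differences by the number of values n gives the population variance; when the data are a sample used to estimate the variance of a larger population, the sum is usually divided by n − 1 instead. Variance describes how far the data spread from the mean, and because the differences are squared, it is expressed in squared units (for example, cm² for heights measured in cm).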

Example: Suppose we have a data set of the heights (in cm) of 10 people: 150, 155, 160, 165, 170, 175, 180, 185, 190, and 195. The mean of these heights is 1725 / 10 = 172.5. Treating the ten heights as the whole population, so dividing by 10, the variance is:

((150 − 172.5)² + (155 − 172.5)² + … + (195 − 172.5)²) / 10 = 2062.5 / 10 = 206.25
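As a rough check, the sketch below recomputes the mean and variance of the ten heights directly from the data, once by hand and once with Python's standard statistics module. It reports both the population variance (divide by n) and the sample variance (divide by n − 1). The variable names are illustrative.

```python
from statistics import mean, pvariance, variance

# Heights in cm from the example above.
heights = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]

m = mean(heights)
squared_diffs = [(h - m) ** 2 for h in heights]

# Population variance: average squared difference from the mean.
population_var = sum(squared_diffs) / len(heights)

# Sample variance: divide by n - 1 instead of n.
sample_var = sum(squared_diffs) / (len(heights) - 1)

print(m)                      # 172.5
print(population_var)         # 206.25
print(round(sample_var, 2))   # 229.17

# The standard library functions give the same values.
print(pvariance(heights))            # 206.25
print(round(variance(heights), 2))   # 229.17
```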

Standard Deviation:

Standard deviation is a measure of variability that represents the square root of the variance. It is useful because it has the same units as the data and is easily interpretable. A small standard deviation indicates that the data is tightly clustered around the mean, while a large standard deviation indicates that the data is more spread out.

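Example: Continuing with the heights from the variance example, the squared differences from the mean of 172.5 cm average to 2062.5 / 10 = 206.25, so the standard deviation of the heights is:

sqrt(206.25) ≈ 14.36 cm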
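The standard deviation of the ten heights can be computed from the data in the same way. This sketch uses pstdev and stdev from Python's standard statistics module for the population and sample versions respectively; which one applies depends on whether the ten people are treated as the whole population or as a sample.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev

# Heights in cm from the variance example.
heights = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]

# Population standard deviation: square root of the population variance.
population_sd = sqrt(pvariance(heights))
print(round(population_sd, 2))    # 14.36
print(round(pstdev(heights), 2))  # 14.36

# Sample standard deviation: based on the n - 1 variance.
sample_sd = stdev(heights)
print(round(sample_sd, 2))        # 15.14
```

Both results are in centimetres, the same unit as the original measurements.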

Conclusion:

In summary, variability and spread are important concepts in statistics that provide information about the distribution of a data set. Range, interquartile range, variance, and standard deviation are four commonly used measures of variability and spread. Range is easy to calculate but sensitive to outliers, while interquartile range is more robust to outliers. Variance and standard deviation provide information about the spread of the data from the mean and are useful for describing the distribution of a population.
