Measure of variability

Deependra Verma
5 min read · May 29, 2023


Photo by Pritesh Sudra on Unsplash

The terms “measure of variability” and “measure of dispersion” are used interchangeably in statistics. They both refer to the same concept, which is quantifying the spread or scattering of data points in a dataset.

Measures of variability or dispersion describe how the data points are distributed around a measure of central tendency (such as the mean, median, or mode) and indicate how far the values deviate from that central value. For this reason, dispersion is also known as scatter, spread, or variation.

A central value represents the data well when the dispersion around it is small, which is why dispersion tells us how stable or consistent a series is.

One of the very common real-life examples to which we can all relate is weather variability. Measures of variability can be used to understand how temperatures vary throughout the year in a particular city. For example, the range of temperatures between the hottest and coldest days can give you an idea of how much the weather fluctuates. A higher range would indicate more variability in the temperatures, while a lower range would suggest more consistent weather.

For comparing two or more series, we have a few metrics that tell us which series is less diverse and which series has more variation. There are two broad classifications of measures of dispersion:

a. Absolute Measure (Unit same)

  1. Range
  2. Quartile Deviation
  3. Mean Deviation
  4. Standard Deviation
  5. Variance

b. Relative Measure (Unit Free)

  1. Coefficient of Range
  2. Coefficient of Quartile Deviation
  3. Coefficient of Mean Deviation
  4. Coefficient of Standard Deviation
  5. Coefficient of Variation

If all the series being compared are in the same unit, we use an absolute measure; if any of them is in a different unit, we use a relative measure, that is, the corresponding coefficient.
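To see why the unit matters, here is a minimal Python sketch. It assumes made-up height values and uses the coefficient of variation (standard deviation divided by the mean, times 100) as the relative measure; the function and variable names are illustrative only.

```python
import statistics

# Hypothetical heights of the same five people, recorded in two units.
heights_cm = [150.0, 160.0, 165.0, 172.0, 180.0]
heights_m = [h / 100 for h in heights_cm]


def coefficient_of_variation(values):
    """Relative measure: population standard deviation as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.mean(values) * 100


# Absolute measure: its value changes when the unit changes.
sd_cm = statistics.pstdev(heights_cm)
sd_m = statistics.pstdev(heights_m)

# Relative measure: unit free, so both series give the same value.
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(f"Standard deviation: {sd_cm:.2f} cm vs {sd_m:.4f} m")
print(f"Coefficient of variation: {cv_cm:.2f}% vs {cv_m:.2f}%")
```

The standard deviation differs by a factor of 100 between the two versions of the same data, while the coefficient of variation is identical, which is what makes relative measures suitable for comparing series in different units.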

Let us discuss all the metrics one by one:

1. Range/Coefficient of Range:

Figure: Example of a range

The range is a very simple metric; it is just the difference between the largest and smallest observation or value.

Range (R) = Largest Observation (L) - Smallest Observation (S)

Coefficient of Range = (L - S) / (L + S)

Any coefficient is usually reported as a percentage, that is, the coefficient value multiplied by 100.

The range has some limitations:

a. It is highly sensitive to outliers.

b. It considers only the two extreme values and ignores how the rest of the data is distributed.

c. It is inadequate for small sample sizes.
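The range, its coefficient, and the sensitivity to outliers can be sketched in a few lines of Python. The temperature values and function names below are hypothetical, chosen only to illustrate the formulas above.

```python
def value_range(values):
    """Range R = L - S (largest minus smallest observation)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range (L - S) / (L + S); assumes L + S is not zero."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Hypothetical daily temperatures in degrees Celsius.
temps = [18, 21, 22, 24, 25]
temps_with_outlier = temps + [45]  # one unusually hot day

r = value_range(temps)                    # 25 - 18
c = coefficient_of_range(temps)           # (25 - 18) / (25 + 18)
r_out = value_range(temps_with_outlier)   # 45 - 18

print(f"Range: {r}, coefficient of range: {c * 100:.1f}%")
print(f"Range with one outlier: {r_out}")
```

A single extreme value changes the range from 7 to 27 even though the other five observations are unchanged, which illustrates limitations (a) and (b).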

2. Quartile Deviation/Coefficient of Quartile Deviation:

Quartiles are statistical measures that divide a dataset into four equal parts. They provide insights into the distribution of the data and help identify the central tendency and spread of the values.

The three quartiles correspond to the 25th, 50th (also known as the median), and 75th percentiles.

The first quartile (Q1), also called the lower quartile, is the 25th percentile. It divides the data into the bottom 25% and the top 75%, so it is the value below which 25% of the data points fall.

The second quartile (Q2), also known as the median, is the 50th percentile. It divides the data into two equal halves, so it is the middle value of the dataset, with 50% of the data points below it and 50% above it.

The third quartile (Q3), also called the upper quartile, is the 75th percentile. It divides the data into the bottom 75% and the top 25%, so it is the value below which 75% of the data points fall.

Quartile Deviation (QD) = (Q3 - Q1) / 2

Coefficient of Quartile Deviation = (Q3 - Q1) / (Q3 + Q1)

To find each quartile in an individual or discrete series, arrange the series in ascending order and then apply the formula:

Qk = value of the [k(N + 1) / 4]th item, for k = 1, 2, 3

where N is the number of observations.

For any series arranged in ascending order, Q2 is the median.
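As a rough illustration, the quartile steps can be written in Python. This sketch assumes the common textbook positional rule, where the k-th quartile sits at position k(N + 1)/4 in the sorted data, and it assumes linear interpolation when that position is not a whole number. The data and function names are hypothetical.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) at 1-based position k(N + 1)/4 of the sorted data.

    Fractional positions are resolved by linear interpolation (an assumption).
    """
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1) / 4          # 1-based position in the sorted series
    lower = int(pos)               # whole part of the position
    frac = pos - lower             # fractional part of the position
    if lower < 1:                  # position falls before the first item
        return data[0]
    if lower >= n:                 # position falls at or after the last item
        return data[-1]
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


# Hypothetical individual series (order does not matter, it is sorted inside).
data = [14, 2, 10, 6, 12, 4, 8]

q1 = quartile(data, 1)
q2 = quartile(data, 2)
q3 = quartile(data, 3)

quartile_deviation = (q3 - q1) / 2
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(f"Q1={q1}, Q2={q2}, Q3={q3}")
print(f"QD={quartile_deviation}, coefficient of QD={coefficient_of_qd * 100:.0f}%")
```

With seven observations the positions are 2, 4, and 6, so the quartiles are read directly from the sorted series. Other quartile conventions exist and can give slightly different values on the same data, so it is worth checking which one a given tool uses.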

3. Variance/Coefficient of Variation:

Figure: Variation in the quality of apples around the mean quality

Variance is defined as the average of the squared distances between each data point and the mean of the dataset. Put simply, it is the dispersion around the mean.

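Variance for a population: σ² = Σ (xᵢ - μ)² / N

Variance for a sample: s² = Σ (xᵢ - x̄)² / (n - 1)

Here μ is the population mean, N is the population size, x̄ is the sample mean, and n is the sample size.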

Note: Variance is expressed in squared units, so its unit is not the same as that of the original values. For this reason, another measure of variability is often used instead: the standard deviation.
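The difference between the population and sample versions of variance is easiest to see in code. The sketch below assumes a small made-up dataset and hypothetical function names, and compares hand-written versions against `pvariance` and `variance` from Python's standard `statistics` module.

```python
import statistics


def population_variance(values):
    """Average squared distance from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared distances from the sample mean, dividing by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical observations; their mean is 5 and the squared distances sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)   # 32 / 8
samp_var = sample_variance(data)      # 32 / 7

print(f"Population variance: {pop_var}")
print(f"Sample variance: {samp_var:.3f}")
print(f"statistics module: {statistics.pvariance(data)}, {statistics.variance(data):.3f}")
```

The sample variance is always a little larger than the population variance computed on the same numbers, because it divides by n - 1 instead of N.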

4. Standard Deviation/Coefficient of Standard Deviation:

Standard deviation measures the amount of variability or dispersion around the average of the data. It is the square root of the variance, so it is expressed in the same unit as the data, and it shows how concentrated the data is around the mean of the dataset.

Standard deviation for a population: σ = √( Σ (xᵢ - μ)² / N )

Figure: Mean and Standard Deviation

The standard deviation therefore tells us how far, on average, the data points lie from the mean. For a normally distributed dataset, the intervals obtained by adding and subtracting multiples of the standard deviation from the mean contain fixed shares of the data: about 68% of values lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three.
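Both points can be checked with a short Python sketch. It assumes a made-up dataset and a hypothetical helper `share_within`, and uses `NormalDist` from the standard `statistics` module (Python 3.8 or later) to compute the share of a normal distribution that lies within k standard deviations of the mean.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical observations with mean 5 and population variance 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Standard deviation is the square root of the variance.
sigma = math.sqrt(statistics.pvariance(data))


def share_within(k, mu=0.0, sd=1.0):
    """Share of a normal distribution lying within k standard deviations of its mean."""
    dist = NormalDist(mu, sd)
    return dist.cdf(mu + k * sd) - dist.cdf(mu - k * sd)


print(f"Population standard deviation: {sigma}")
for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {share_within(k) * 100:.1f}%")
```

The shares do not depend on the particular mean or standard deviation passed to `share_within`, which is why they can be quoted as fixed percentages for any normal distribution.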

Conclusion:
In conclusion, measures of variability, also known as measures of dispersion, quantify the spread or scattering of data points in a dataset. They help us understand the extent to which values deviate from a central tendency measure. Whether we use absolute measures such as the range, quartile deviation, mean deviation, standard deviation, and variance, or relative measures such as the coefficients, these metrics show how stable or variable the data is. Examining dispersion alongside the central value lets us draw informed conclusions about a dataset's characteristics.
