Pooled variance and its effective degree of freedom

a limitation of the Welch–Satterthwaite equation

MCMC Addict · 10 min read · May 7, 2024

Conveyor belt in a beer factory (photo taken by the author)

Descriptive statistics are often used to summarise and describe the characteristics of a data set obtained by measurement. Measures of central tendency (e.g. mean, median, mode) and measures of variability (e.g. variance, standard deviation, range) provide insight into the distribution and spread of data. These statistics help researchers, including metrologists, to understand the central values and degree of dispersion within a data set. However, the metrologist is more interested in a measurand's mean and uncertainty.
Suppose the measurement uncertainty is only affected by the randomness of the data. In that case, the uncertainty of the mean can be described by a Student's t-distribution with degrees of freedom equal to the number of measurements minus one. In a previous article of mine, I showed how the t-distribution can be generated numerically from random variables sampled from a population following a normal distribution. In some cases, however, the uncertainty of the variance becomes more important. In the same context, I have already pointed out that we live in an era where the variance or variability of data is increasingly important.

In this article, I will describe a case where the variance is important, how to reduce the uncertainty in the variance by using the pooled variance, and how to calculate the effective degrees of freedom for the pooled variance. The Welch-Satterthwaite equation is widely used in uncertainty analysis. Since the equation is only an approximation, one of its limitations is illustrated by numerical simulation.

Cases where uncertainty of variance is important

Suppose you are a quality control manager in a manufacturing plant responsible for ensuring the consistency of the weight of a product. You receive batches of the product from different production lines, and your task is to determine whether the variance in weight across these batches is within acceptable limits. Let's say you have two scenarios:

Scenario 1: Low variance. In this scenario, the variance of the weight of the product between batches is relatively low. This means that the weights of individual items within each batch are very similar. As a result, you can be confident that the manufacturing process is consistent and that the product meets quality standards. With low variance, you have less uncertainty about the weight of each item, which contributes to greater confidence in the consistency of the product.

Scenario 2: High variance. In contrast, in this scenario the variance of the product weight between batches is high. This indicates that there is considerable variability in the weights of individual items within each batch. This variability may be due to inconsistencies in the manufacturing process, equipment malfunctions or other factors. As a quality control manager, high variance raises concerns about the consistency and reliability of the product. You cannot be as confident in the quality of the product because of the increased uncertainty in the weight of each item.

In addition to the example above, the uncertainty of the variance is essential for a general understanding of other sectors:

Quality control: In industries such as manufacturing, healthcare, finance, etc., maintaining consistent quality is critical. By assessing the uncertainty of variance, managers can identify potential problems in production processes and take corrective action to improve quality and reduce variability.

Decision making: Uncertainty in variance affects decision making. For example, if you’re comparing two products or processes, knowing the uncertainty in their respective variances helps you determine whether any observed differences are statistically significant or simply due to chance.

Risk management: Variance uncertainty informs risk assessment and management strategies. For example, in financial markets, understanding the uncertainty of asset return variances is essential for estimating potential losses and effectively managing investment portfolios.

Research and development: In scientific research, understanding variance uncertainty ensures the reliability and reproducibility of experimental results. Researchers must take variance uncertainty into account when interpreting study results and drawing conclusions.

In short, variance uncertainty provides valuable insight into the reliability, consistency, and risk of data or processes. It enables informed decision-making, quality improvement, and risk management and increases the credibility of research results.

How to reduce the uncertainty of variance

To increase the number of measurement samples: as with the uncertainty of the mean, it is evident that the more samples we take, the smaller the uncertainty of the variance becomes; however, this approach can be too costly. To obtain a confidence interval for the variance, we need to know how the variance is distributed. It is already known that the sample variance (S²), multiplied by the degrees of freedom (DoF, ν = n−1) and normalised by the population variance, follows a chi-square distribution with ν degrees of freedom:

(n−1)S²/σ² ~ χ²(ν), with ν = n−1

I want to show this numerically using a piece of Python code for the reader's understanding. The samples are taken from the standard normal distribution (σ = 1). The histograms of the sample variances for different DoFs, each multiplied by its DoF and normalised by σ², are plotted and compared with the analytical forms of the chi-square distributions.

import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st

np.random.seed(7)
dfs = np.arange(2, 11)  # degrees of freedom (DoF) to test
N, bin = 1000000, 100   # number of sample variances, number of histogram bins
bins = np.linspace(0, 15, bin)

titles, legend = ['simulation', 'theory'], []
fig, ax = plt.subplots(1, 2, figsize=(12, 4))

for df in dfs:
    # (df + 1) samples from the standard normal distribution, repeated N times
    rv = np.random.normal(size=(df + 1)*N).reshape(df + 1, N)
    # sample variance multiplied by the DoF (sigma = 1, so already normalised)
    rvst = df*np.var(rv, axis=0, ddof=1)

    y, x = np.histogram(rvst, bins=bins, density=True)
    ax[0].plot(x[:-1], y)
    ax[1].plot(bins, st.chi2.pdf(bins, df))
    legend.append(f'\N{GREEK SMALL LETTER NU}={df}')
[ax[i].legend(legend) for i in range(2)]
[ax[i].set_title(titles[i]) for i in range(2)]
plt.show()
Numerically generated chi-squared distributions with different degrees of freedom compared with their analytical forms.

To show the agreement quantitatively, the samples for DoF = 10 can be fitted to the analytical form using the code below:

dof, loc, scale = st.chi2.fit(rvst)
print(f'dof={dof:4.3f}, loc={loc:4.3f}, scale={scale:4.3f}')
dof=10.033, loc=-0.019, scale=0.998
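
Since (n−1)S²/σ² follows a chi-square distribution, the same fact can be used to attach a confidence interval to a sample variance, i.e. to quantify the uncertainty of the variance mentioned above. Below is a minimal sketch with a small set of hypothetical measurements (the numbers are illustrative only) at a 95 % confidence level:

import numpy as np
import scipy.stats as st

# hypothetical measurements (for illustration only)
data = np.array([10.2, 9.8, 10.1, 10.4, 9.9, 10.0])
n = len(data)
s2 = np.var(data, ddof=1)   # sample variance
nu = n - 1                  # degrees of freedom

# 95 % confidence interval from nu*S^2/sigma^2 ~ chi-square(nu)
alpha = 0.05
lower = nu*s2/st.chi2.ppf(1 - alpha/2, nu)
upper = nu*s2/st.chi2.ppf(alpha/2, nu)
print(f's2 = {s2:.4f}, 95 % CI for the variance: [{lower:.4f}, {upper:.4f}]')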

To use the pooled variance method: the pooled (or combined) variance is an estimate of the variance of several different populations, where the mean of each population may be different, but the variance of each population can be assumed to be the same. Under this assumption of equal population variances, the pooled sample variance provides a more precise estimate of the variance than the individual sample variances. This increased precision leads to a more credible standard uncertainty when used in uncertainty assessment, and to increased statistical power when used in statistical tests that compare populations, such as the t-test.

Statistics often involves collecting data for a dependent variable, y, over a range of values for the independent variable, x. For example, the length of a gauge block might be studied as a function of ambient temperature as the block's temperature reaches thermal equilibrium. If numerous replicate tests are required at each value of x to obtain a small variance in y, the experiment may become too costly. Using the pooled variance principle, reasonable variance estimates can be obtained after repeating each test at a given x only a few times.

The pooled variance estimates the fixed common variance 𝜎² underlying various populations with different means. We are given a set of sample variances sᵢ², where the populations are indexed 𝑖 = 1,…,𝑚.

Assuming uniform sample sizes, nᵢ = n, the pooled variance sₚ² can be computed as the arithmetic mean of the sample variances:

sₚ² = (s₁² + s₂² + … + sₘ²)/m

If the sample sizes are non-uniform, then the pooled variance sₚ² can be computed as a weighted average, using the respective degrees of freedom νᵢ = nᵢ − 1 as weights:

sₚ² = (ν₁s₁² + ν₂s₂² + … + νₘsₘ²)/(ν₁ + ν₂ + … + νₘ) = Σ νᵢsᵢ² / Σ νᵢ

It is known that the statistic sₚ²/σ² follows a reduced chi-square distribution with DoF = Σνᵢ; equivalently, (Σνᵢ)·sₚ²/σ² follows a chi-square distribution with Σνᵢ degrees of freedom.
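
As a quick numerical check of this statement, the sketch below (with arbitrarily chosen means and sample sizes) draws samples from normal populations sharing a unit variance, pools their sample variances, and fits the scaled statistic to a chi-square distribution:

import numpy as np
import scipy.stats as st

np.random.seed(1)
N = 200000                   # number of repetitions
means = [3.0, -1.0, 5.0]     # different population means (arbitrary)
sizes = [3, 5, 8]            # different sample sizes (arbitrary)
nu = [n - 1 for n in sizes]  # degrees of freedom per sample

# pooled variance for each repetition (population variance = 1)
s2 = [np.var(np.random.normal(m, 1.0, size=(n, N)), axis=0, ddof=1)
      for m, n in zip(means, sizes)]
sp2 = sum(v*s for v, s in zip(nu, s2))/sum(nu)

# (sum of nu) * sp2 / sigma^2 should follow chi-square with sum(nu) DoF
dof, loc, scale = st.chi2.fit(sum(nu)*sp2)
print(f'expected DoF = {sum(nu)}, fitted DoF = {dof:4.2f}')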

As a demonstration, let's compute the pooled variance for a toy problem with five sets of y-data, given in the piece of code below:

import statistics

# y-data for x[0], x[1], x[2], x[3], x[4]
y = [[31, 30, 29], [42, 41, 40, 39], [31, 28],
     [23, 22, 21, 19, 18], [21, 20, 19, 18, 17]]

# sample variance of each data set
s2 = [statistics.stdev(x)**2 for x in y]
print(f'sample variance = {[f"{v:6.4f}" for v in s2]}')

# pooled variance, weighted by the degrees of freedom len(y[i]) - 1
pv = sum(v*(len(x) - 1) for v, x in zip(s2, y))/sum(len(x) - 1 for x in y)
print(f'pooled variance = {pv:3.2f}')
sample variance = ['1.0000', '1.6667', '4.5000', '4.3000', '2.5000']
pooled variance = 2.76
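
The pooled estimate carries Σνᵢ = 2 + 3 + 1 + 4 + 4 = 14 degrees of freedom, far more than any individual data set. Since the relative standard uncertainty of a variance estimate with ν DoF is √(2/ν), pooling visibly reduces the uncertainty of the variance. A small sketch of this comparison, reusing the DoFs of the five data sets above:

import math

# degrees of freedom of the five data sets above and of the pooled estimate
nu_i = [2, 3, 1, 4, 4]
nu_pooled = sum(nu_i)  # 14

# relative standard uncertainty of a variance estimate with nu DoF: sqrt(2/nu)
for nu in nu_i + [nu_pooled]:
    print(f'nu = {nu:2d}: relative uncertainty of the variance = {math.sqrt(2/nu):4.2f}')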

Welch–Satterthwaite (W-S) equation

This equation approximates the effective DoF of a linear combination of independent sample variances, in our case the pooled DoF corresponding to the pooled variance. For n sample variances sᵢ² (i = 1, …, n), each with νᵢ DoF, we often need to compute the linear combination of the variances:

χ' = k₁s₁² + k₂s₂² + … + kₙsₙ² = Σ kᵢsᵢ²

where kᵢ is a positive real number. The probability distribution of χ' cannot be expressed analytically. However, it is known that its distribution can be approximated by another chi-square distribution, whose effective DoF is given by the Welch–Satterthwaite equation:

ν_eff ≈ (Σ kᵢsᵢ²)² / Σ (kᵢ²sᵢ⁴/νᵢ)

In uncertainty analysis, we usually calculate this equation with kᵢ = 1, where each variance comes from the respective independent uncertainty component. In this case, the effective DoF becomes simpler:

ν_eff ≈ (Σ sᵢ²)² / Σ (sᵢ⁴/νᵢ)
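
As a worked example, the simplified equation with kᵢ = 1 can be evaluated directly; the standard uncertainties and DoFs below are hypothetical, not taken from any of the data above:

# hypothetical standard uncertainties and their degrees of freedom
s = [0.10, 0.25, 0.15]   # standard uncertainties of three components
nu = [4, 9, 14]          # corresponding degrees of freedom

s2 = [u**2 for u in s]   # variances

# Welch-Satterthwaite equation with k_i = 1
nu_eff = sum(s2)**2/sum(v**2/n for v, n in zip(s2, nu))
print(f'effective DoF = {nu_eff:4.1f}')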

Note that the W-S equation is an approximation. To numerically check its validity, I have calculated the pooled variance and the effective DoF of the variance samples using the W-S equation above. The samples are taken from populations following chi-square distributions. The random variables of the pooled variance are fitted to a chi-square distribution to obtain the distribution's DoF, location and scale. This DoF from the fit can be considered the numerical solution for the effective DoF, to be compared with the effective DoF from the W-S equation.

The code below calculates the pooled variance and the effective DoF for two and three independent variables with unit standard deviations and various DoFs. With N = 4,000,000, the repeatability of the calculation is around 1 %, except for loc_fit. Note that in some cases the W-S approximation shows a discrepancy larger than this uncertainty, so care must be taken when applying it to cases where the effective DoF is important. Sometimes, it may be worth using this code to calculate the effective DoF numerically.

N = 4000000

def pooled_variance(dof):
    '''Numerically calculate the pooled variance and the
    effective DoF from the Welch–Satterthwaite equation.'''
    # one set of chi-square distributed random variables per DoF
    rv = np.array([st.chi2.rvs(df=d, size=N) for d in dof])
    sum1 = np.zeros(N)
    sum2 = np.zeros(N)
    chi = np.zeros(N)

    for i, _ in enumerate(dof):
        sum1 = sum1 + rv[i]**2
        sum2 = sum2 + rv[i]**4/dof[i]
        chi = chi + dof[i]*rv[i]
    edof = np.mean(sum1**2/sum2)      # effective DoF from the W-S equation
    pooled_variance = chi/sum(dof)    # pooled variance weighted by the DoF
    return pooled_variance, edof


from tabulate import tabulate

np.random.seed(2)
dofs_sample = [[[5, 5], [5, 10], [5, 15], [10, 20]],
               [[5, 5, 5], [5, 10, 15], [10, 10, 15], [10, 15, 20]]]
bin = 200
bins_array = np.linspace(0, 40, bin)
fig, ax = plt.subplots(4, 4, figsize=(16, 10),
                       gridspec_kw={'height_ratios': [4, 1, 4, 1]})
plt.tight_layout()
table = []

for j, _ in enumerate(dofs_sample):
    for i, _ in enumerate(dofs_sample[0]):
        rv, w_s = pooled_variance(dofs_sample[j][i])

        # plot histograms
        y, x = np.histogram(rv, bins=bins_array, density=True)
        ax[2*j][i].plot(x[0:-1], y)
        ax[2*j][i].set_title(f'DoF = {dofs_sample[j][i]}')

        # fit and plot residuals
        dof, loc, scale = st.chi2.fit(rv)
        ax[2*j+1][i].plot(x[0:-1], y - st.chi2.pdf(x[0:-1], dof, loc=loc, scale=scale))

        table.append([f'{dofs_sample[j][i]}', f'{w_s:3.1f}',
                      f'{dof:3.1f}', f'{loc:3.2f}', f'{scale:3.2f}'])

print(tabulate(table, headers=["Samples_DoF", "DoF_W-S equation",
                               "DoF_fit", "loc_fit", "scale_fit"]))
Samples_DoF      DoF_W-S equation    DoF_fit    loc_fit    scale_fit
-------------  ------------------  ---------  ---------  -----------
[5, 5]                        7.6       10         0            0.5
[5, 10]                      12.2       13.2       0.22         0.62
[5, 15]                      17.1       16.7       0.43         0.72
[10, 20]                     25.9       25.7       0.65         0.62
[5, 5, 5]                    10         15         0            0.33
[5, 10, 15]                  23.2       24.9       0.51         0.45
[10, 10, 15]                 27.2       31.5       0.38         0.37
[10, 15, 20]                 36.2       39.2       0.61         0.39
Histograms of the pooled variance of two (top row) and three (third row) independent variables with a set of degrees of freedom in the plot titles. The second and bottom rows show the residuals between the histograms and the chi-squared pdf given by the respective fits.

In the same context, my colleague Mark Ballico also wrote in Metrologia about a weakness of the W-S equation: "The W-S approximation is used to estimate an effective DoF for a probability distribution formed from several independent normal distributions for which only estimates of the variance are known. The ISO Guide to the Expression of Uncertainty in Measurement recommends its use for calculating expanded uncertainties (the ISO term for confidence intervals) for uncertainties formed from several distributions. Although it is recognised that this formula is an approximation, counter-intuitive results can be obtained when a variance estimate of the dominant distribution has a small number of DoFs; the calculated confidence limits are sometimes found to decrease as the contributing uncertainties increase".[3]

Summary

Although in some cases, such as quality control in a manufacturing plant, the variance is more important than the mean, it is often overlooked. To reduce the uncertainty of the variance, we can use the pooled variance method, and the W-S equation gives the effective DoF of the pooled variance. The W-S equation is also essential in uncertainty analysis for obtaining the coverage factor of an expanded uncertainty. Having accepted the W-S equation without much thought, I was surprised to learn that I could make a mistake in applying it. In cases where the effective DoF is important, it is necessary to check the validity of the W-S equation; the Python code in this article can be used to do so.

I intended to include the topic of Welch's t-test in this article. However, to keep this article brief, I am afraid I must leave it for the next one. Please be patient and wait a while.

Reference

  1. Wikipedia, Welch–Satterthwaite equation
  2. Wikipedia, Pooled variance
  3. M. Ballico, Limitations of the Welch-Satterthwaite approximation for measurement uncertainty calculations, Metrologia, Volume 37, Number 1, DOI 10.1088/0026-1394/37/1/8
