We live in an era where the variance or variability of data is increasingly important.

MCMC Addict
5 min read · Dec 12, 2023


Photo by Pritesh Sudra on Unsplash

Prologue

We try to quantify observations of an object or phenomenon by assigning numbers. In general, things are not easily explained by a single variable. Even if a single variable is sufficient to describe something, repeated observations may not yield the same value. This variability is caused by imperfections in our observation and by the intrinsic nature of the objects under study. The imperfections are introduced by any instrument used in observation, even in this age of automated data collection. They arise from random and systematic effects that can be reduced, though never fully eliminated, by selecting more precise instruments and carefully calibrating them before use. Nevertheless, the intrinsic scatter of the data remains in the object itself, resulting in a probability distribution with a particular variance or standard deviation.

We usually summarize a population of an object or phenomenon with two representative values: the mean and the variance. The mean is a helpful indicator of the central tendency of the population. For example, a stock market index is a weighted average that sometimes makes investors happy and sometimes makes them unhappy. The difference in the average weight of a given age group of growing children between two countries can give us an idea of the differences in diet. Average blood pressure allows doctors to distinguish hypertension from normal blood pressure, and a city's average temperature in a given month helps travellers pack suitable clothes. Until now, the mean has often been sufficient to describe a population's behaviour and to compare two groups properly.

Variance and quality of human life

We live in a world where our main concern is quality of life, which is supported by a safe and healthy life, a clean environment, security from future uncertainties, and so on. We could even quantify the quality of life by introducing an equation with several factors as independent variables. The mean of that quantity allows us to monitor progress and give positive feedback. But if an increase in the variance accompanies the increase in the mean, can we be satisfied with that? The greater the variance, the greater the inequality between individuals. The more we focus on the quality of life, the more critical the variance of the data becomes. I will describe some examples to argue that we live in such an era.

Global climate change

Climate change is not just about average temperature increases. Scientific studies indicate that extreme weather events, such as heat waves and major storms, are likely to become more frequent or intense with human-induced climate change. The paper below shows the contribution of climate change to the mean and variability of both temperature and precipitation. Increased variability leads to increased weather uncertainty, which adds to our anxiety about the future.

A stock market index to reduce financial anxiety

Volatility, or the rate at which prices change, is often used as a measure of market sentiment and, in particular, of the level of fear among market participants. One example is the CBOE Volatility Index (VIX), which represents the market’s expectations of the relative strength of short-term price changes in the S&P 500 Index. There are two common ways to measure volatility. One method, which the VIX does not use, relies on historical volatility, computed from statistics on past prices over a period of time: the mean, the variance, and finally the standard deviation of the historical data set. The other method, which the VIX does use, infers the expected volatility from the current prices of S&P 500 index options (implied volatility). The Volatility Index has historically been a harbinger of financial crises, as shown in the chart below.

from https://www.thirdway.org/report/the-vix-measuring-uncertainty-in-financial-markets (Ref. 2)
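The historical-volatility approach mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how the VIX is computed: the function name `historical_volatility`, the use of log returns, the 252-trading-day annualization, and the two price series are all my own illustrative choices.

```python
import math
import statistics


def historical_volatility(prices, periods_per_year=252):
    """Annualized standard deviation of log returns for a price series.

    Assumes equally spaced observations (e.g. daily closes) and roughly
    independent returns, so that the variance scales linearly with time.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices to estimate a variance")
    # Log return between each pair of consecutive observations.
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    # Sample standard deviation (n - 1 in the denominator).
    per_period_sd = statistics.stdev(log_returns)
    # Scale the per-period figure to a yearly one.
    return per_period_sd * math.sqrt(periods_per_year)


# Made-up closing prices, for illustration only.
calm = [100.0, 100.5, 100.2, 100.8, 100.6, 101.0]
jumpy = [100.0, 104.0, 97.0, 103.0, 96.0, 101.0]

print("calm market :", historical_volatility(calm))
print("jumpy market:", historical_volatility(jumpy))
```

Both series start at 100 and end at 101, so their average behaviour is the same; only the scatter of the returns separates the calm market from the jumpy one.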

Diagnosing a disease for a healthy life

Good health is maintained through regular check-ups and is essential for a high quality of life. Because the results of these tests vary widely from person to person, health care organizations recommend a range of test results as normal. The range is derived from the distribution of the results in a large population. In general, if the distribution is not highly skewed, it can be characterized by its mean and variance. For example, according to the WHO diagnostic criteria for diabetes, a fasting blood glucose level of less than 110 mg/dL is considered normal. Below are the distributions of the glucose concentrations for two populations during the first week after admission to the ICU of a hospital. The diabetic group is more likely to be admitted to the ICU than the normal group. The distribution of the diabetic group is more skewed toward higher concentrations. The distribution of the normal group could be used to establish or improve a normal range.

Distributions of blood glucose for all patients (n=19,694) in the first week of ICU admission. Diabetic patients are shown in orange, and non-diabetic patients are shown in green. The band covering the peaks marks the target glucose region between 80 and 180 mg/dL. (Ref. 3)
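To make the idea of deriving a normal range from a population distribution concrete, here is a small Python sketch. It assumes one common convention, a range covering the central 95% of a reference population, which the text above does not specify. The data are synthetic (drawn from a normal distribution with a mean of 90 and a standard deviation of 10), and both function names are hypothetical.

```python
import random
import statistics


def reference_range_normal(values, z=1.96):
    """Mean +/- z standard deviations; reasonable only for near-normal data."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - z * sd, mean + z * sd


def reference_range_percentile(values):
    """2.5th and 97.5th percentiles; makes no assumption about the shape."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(values, n=40)
    return cuts[0], cuts[-1]


# Synthetic "healthy" readings, for illustration only (not real patient data).
random.seed(0)
sample = [random.gauss(90.0, 10.0) for _ in range(5000)]

low, high = reference_range_normal(sample)
p_low, p_high = reference_range_percentile(sample)
print(f"mean +/- 1.96 sd : {low:.1f} to {high:.1f}")
print(f"percentile-based : {p_low:.1f} to {p_high:.1f}")
```

For a symmetric, bell-shaped sample the two ranges nearly coincide, which is why the variance alone is enough to set the width of the range. For a strongly skewed distribution, such as that of the diabetic group, they would diverge and the percentile version would be the safer choice.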

Determining a safety factor for safe environments

Stress and strength are key quantities that engineers must understand in order to design and construct a safe structure. Stress is a measure of how much force is applied to an object, and strength is the ability to withstand that stress. If the stress exceeds the strength of a part, the part will fail. Both the stress on a part and the strength of its material are specified as probability distributions, as shown in the following figure. The probability of failure is governed by how much the stress and strength distributions overlap. Roughly speaking, a safety factor is the ratio of the mean strength to the mean stress. If the variance of each distribution is smaller, a smaller safety factor is sufficient to meet a target failure probability, because the overlap shrinks. Conversely, for a given safety factor, the failure probability can be lowered simply by reducing the variances of the stress and the strength. Variances are therefore also important in the development of new materials, to ensure quality of life in terms of safety. A statistic like the variance of material properties allows us to build structures strong enough to allay our fears.

from https://reliability.readthedocs.io/en/latest/index.html
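The link between variance and failure probability can be sketched with a short calculation. The sketch assumes that stress and strength are independent and normally distributed, which is a simplification I am adding; under that assumption the margin (strength minus stress) is also normal, and failure means the margin is negative. The function name and all numbers are hypothetical, and the code uses only the Python standard library rather than the package linked above.

```python
import math
from statistics import NormalDist


def failure_probability(stress_mean, stress_sd, strength_mean, strength_sd):
    """P(stress > strength) for independent, normally distributed variables."""
    # The margin (strength - stress) is normal: the means subtract,
    # while the variances add.
    margin_mean = strength_mean - stress_mean
    margin_sd = math.hypot(stress_sd, strength_sd)
    # Failure occurs when the margin falls below zero.
    return NormalDist(margin_mean, margin_sd).cdf(0.0)


# Same means in both cases, so the same safety factor of 150 / 100 = 1.5.
wide = failure_probability(100.0, 20.0, 150.0, 20.0)
narrow = failure_probability(100.0, 10.0, 150.0, 10.0)

print(f"sd = 20: P(failure) = {wide:.5f}")
print(f"sd = 10: P(failure) = {narrow:.5f}")
```

With the safety factor held at 1.5, halving both standard deviations cuts the failure probability by more than two orders of magnitude in this example, which is the trade-off between variance and safety factor described above.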

Epilogue

I have presented examples in which the variance is an essential statistic, ranging from global climate change to financial markets, health care, and structural safety. All of these areas are closely related to the quality of human life. Much of our fear about uncertain circumstances and the future comes from a large variance in the quantity that describes them. We certainly live in an era in which the variance or variability of data is becoming increasingly important.
