The Role of Student's t-Distribution
Introduction
Have you ever had a teacher tell you something like "in this statistical test we use the t-distribution", with no better explanation than "because there is additional uncertainty, we need a distribution with thicker tails than the normal"?
Well, in this short article you will see a straightforward explanation of why we actually need this distribution.
Distribution of The Sample Mean
The sample mean is a very common statistic that most of us have used in our analyses. This estimator, like every estimator computed from sample data, follows a distribution of its own. As most of you already know, the Central Limit Theorem (CLT) says, in simple words, that the standardized sample mean approximately follows a standard normal distribution as long as the sample size is large enough. Let X denote a random variable and \bar{X} the sample mean of n independent observations of X. Remember that the variance of the sample mean is

\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n},

where \sigma^2 is the population variance of X and n is the sample size. Therefore, if n is large enough, the CLT claims that

\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \;\xrightarrow{d}\; N(0, 1),

where \mu is the population mean.
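As a quick sanity check (a numpy sketch, not part of the original article), we can simulate this claim with a deliberately non-normal population: the standardized sample mean still ends up looking standard normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: exponential(1), so mu = 1 and sigma^2 = 1 (far from normal).
mu, sigma, n = 1.0, 1.0, 200
reps = 100_000

# Draw many samples of size n and standardize each sample mean.
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

# By the CLT, z should have mean close to 0 and standard deviation close to 1.
print(z.mean(), z.std())
```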
Unknown population variance
This is probably the exact point where your teacher told you "given that the population variance, \sigma^2, has to be estimated, we need a distribution that accounts for this extra source of uncertainty, and that distribution is Student's t, which looks like a normal distribution but has thicker tails". This makes sense intuitively, right? In most cases the population variance is unknown and has to be estimated, just as we estimate the population mean. But where does Student's t-distribution come from?
Definition of t-student distribution
Let’s start with the mathematical definition of Student's t-distribution. A t-distributed random variable is obtained by dividing a standard normal random variable by the square root of a chi-squared random variable with n degrees of freedom divided by n. That is,

T = \frac{Z}{\sqrt{U / n}}, \qquad Z \sim N(0, 1), \qquad U \sim \chi^2_n,

where Z and U are independent, and where a chi-squared random variable with n degrees of freedom is nothing more than the sum of n squared independent standard normal random variables; that is,

U = \sum_{i=1}^{n} Z_i^2, \qquad Z_i \sim N(0, 1) \text{ independent.}

I know this definition may seem arbitrary, but remember, it is just a definition. In other words, it is as simple as giving a name to a mathematical expression! The resulting distribution is called Student's t with n degrees of freedom.
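The definition above can be simulated directly (a numpy sketch, not from the original article): build T from its ingredients and check that its tails really are heavier than the standard normal's.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5              # degrees of freedom
reps = 200_000

z = rng.standard_normal(reps)              # Z ~ N(0, 1)
u = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)  # U: sum of n squared normals
t = z / np.sqrt(u / n)                     # T = Z / sqrt(U / n)

# Heavier tails: P(|T| > 2) should exceed P(|Z| > 2) (the latter is about 0.046).
print(np.mean(np.abs(t) > 2), np.mean(np.abs(z) > 2))
```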
When Do We Use The t-Student Distribution?
We saw earlier that the standardized sample mean, when the population standard deviation is used, follows a standard normal distribution for n large enough. But how does the distribution of this statistic change when the population variance has to be estimated?
That is, let s be the sample standard deviation (the square root of the sample variance); then, what is the distribution of the following statistic?

\frac{\bar{X} - \mu}{s / \sqrt{n}}

First, keep in mind the definition of Student's t-distribution given earlier. Let’s try to make the numerator of the previous ratio a standard normal random variable (as the definition requires). We can achieve that by dividing the numerator by \sigma / \sqrt{n}, the standard deviation of the sample mean, and applying the CLT to claim that the resulting numerator follows a standard normal distribution. Note that the denominator also has to be divided by the same quantity in order for the ratio to remain unchanged. That is,

\frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{(\bar{X} - \mu) / (\sigma / \sqrt{n})}{s / \sigma} = \frac{Z}{s / \sigma}.
Alright, we are done with the numerator. Now we only have to deal with the denominator by introducing U into it (as the definition of the t-distribution requires). To do that we need the following statistical result, which holds for normally distributed samples and is proven in the appendix of this article:

\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}

If we set U = (n-1)s^2 / \sigma^2, look what we find:

\sqrt{\frac{U}{n-1}} = \sqrt{\frac{s^2}{\sigma^2}} = \frac{s}{\sigma}

Yes, exactly. That is the expression we were looking for: the denominator in the definition of the t-distribution! Hence, it can be claimed that

\frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{Z}{\sqrt{U / (n-1)}} \sim t_{n-1}.
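The whole argument can be checked empirically (a numpy sketch under the assumption of a normal population, not code from the original article): for small n, the s-based statistic has visibly heavier tails than its sigma-based counterpart.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 0.0, 1.0, 5
reps = 200_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                  # sample standard deviation

stat_s = (xbar - mu) / (s / np.sqrt(n))        # variance estimated: t with n-1 d.o.f.
stat_sigma = (xbar - mu) / (sigma / np.sqrt(n))  # variance known: standard normal

# The estimated-variance statistic exceeds |2| more often: thicker tails.
print(np.mean(np.abs(stat_s) > 2), np.mean(np.abs(stat_sigma) > 2))
```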
Congratulations! Once and for all, you can see why we use the t-distribution instead of the normal in many applications of the popular Wald test: whenever the population variance has to be estimated, the statistic follows a t-distribution with n − 1 degrees of freedom.
Appendix
Let’s prove that

\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}

for a sample X_1, \dots, X_n of independent N(\mu, \sigma^2) random variables. It is as simple as expanding the definition of chi-squared shown earlier (a sum of squared standard normal random variables that are independent of each other):

\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} + \left( \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \right)^2 = \frac{(n-1)s^2}{\sigma^2} + \left( \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \right)^2

Note that the left-hand side of the equation is a random variable that follows a chi-squared distribution with n degrees of freedom, and that the second term on the right-hand side follows a chi-squared distribution with one degree of freedom. Therefore, in order for the equality to hold, the first term on the right-hand side has to follow a chi-squared distribution with n − 1 degrees of freedom. (Recall that a sum of independent chi-squared random variables is chi-squared with degrees of freedom equal to the sum of the individual degrees of freedom; the two terms on the right-hand side are indeed independent, because \bar{X} and s^2 are independent when the sample is normal.)
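This appendix result can also be checked by simulation (a numpy sketch, not from the original article): the scaled sample variance should match the first two moments of a chi-squared with n − 1 degrees of freedom, which has mean n − 1 and variance 2(n − 1).

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, n = 2.0, 6
reps = 200_000

x = rng.normal(0.0, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)                # sample variance
u = (n - 1) * s2 / sigma**2               # should be chi-squared with n-1 = 5 d.o.f.

# Chi-squared with 5 degrees of freedom has mean 5 and variance 10.
print(u.mean(), u.var())
```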