Demystifying Statistical Analysis 2: The Independent t-Test Expressed in Linear Regression
Group comparison analyses such as the independent t-test and ANOVA may seem quite different from linear regression, but if we take a look at the cheat sheet in the first part of this series, we will notice that they actually fall under the same column of predicting a continuous dependent variable. The main difference is that t-tests and ANOVAs involve the use of categorical predictors, while linear regression involves the use of continuous predictors. When we start to recognise whether our data is categorical or continuous, selecting the correct statistical analysis becomes a lot more intuitive.
Categorical predictors (e.g. Male vs Female, Children vs Teens vs Adults, etc.) can be expressed in a linear regression using dummy or contrast codes. In fact, statistical packages such as SPSS automatically create coded predictors in the background before running the appropriate statistical analysis, hence the term General Linear Model. In the next few parts of this series, I will attempt to illustrate how some of the popular group comparison analyses are represented in linear regression analysis, with the help of the textbook “Data Analysis: A Model Comparison Approach” by Carey Ryan, Charles M. Judd, and Gary H. McClelland. I hope that by drawing the connections between the various statistical analyses, it will become easier to identify when each statistical analysis should be used.
The independent t-test is one of the most commonly used statistical tests for determining whether there is a difference between the means of 2 unrelated groups. For those who are unfamiliar with the test and want to know how it is usually conducted in SPSS, Laerd Statistics provides a comprehensive step-by-step guide. Otherwise, I will explain the independent t-test using the following regression equation:
Ŷi = b0 + b1Xi
When comparing between 2 groups, there are 2 parameters in the regression model that need to be estimated: b0 and b1.
b0, commonly known as the intercept, estimates Ŷi when Xi is equal to 0. b1 is the estimated slope for the predictor Xi, which in this case represents the comparison between 2 groups such as “Male vs Female”. This comparison is done either by dummy coding (where Male is coded as 0 and Female is coded as 1) or by contrast coding (where Male is coded as -1 and Female is coded as 1).
When dummy coding is used, substituting Xi with 0 gives the Male mean (Ŷi = b0), and substituting Xi with 1 gives the Female mean (Ŷi = b0 + b1). Hence, in dummy coding, b0 always represents the Male mean (or rather the mean of the reference group, i.e. whichever group is coded as 0), while b1 represents the difference between the two means (Female mean minus Male mean).
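To make the dummy-coded case concrete, here is a minimal sketch in plain Python. The scores are hypothetical numbers invented for illustration, and `fit_line` is a small helper written for this example (not part of any statistical package) that applies the usual least-squares formulas for a single predictor.

```python
# Hypothetical scores, invented purely for illustration.
male = [4.0, 5.0, 6.0, 5.0, 5.0]      # mean = 5.0
female = [6.0, 7.0, 8.0, 7.0, 7.0]    # mean = 7.0

def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Dummy coding: Male = 0, Female = 1.
x = [0] * len(male) + [1] * len(female)
y = male + female

b0, b1 = fit_line(x, y)
print(b0, mean(male))                   # intercept next to the Male mean
print(b1, mean(female) - mean(male))    # slope next to the difference in means
```

With these made-up numbers the intercept matches the Male mean and the slope matches the Female mean minus the Male mean, which is the pattern described above.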

Similarly, when contrast coding is used, substituting Xi with -1 gives the Male mean (Ŷi = b0 - b1), and substituting Xi with 1 gives the Female mean (Ŷi = b0 + b1). In this case, however, b0 represents the mean of the Male and Female means, while b1 represents half the difference between the Female and Male means (because -1 and 1 are 2 units apart).
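The contrast-coded case can be sketched the same way. Again, the scores are hypothetical and `fit_line` is an illustrative helper written for this example; only the coding of Xi changes from 0/1 to -1/1.

```python
# Hypothetical scores, invented purely for illustration.
male = [4.0, 5.0, 6.0, 5.0, 5.0]      # mean = 5.0
female = [6.0, 7.0, 8.0, 7.0, 7.0]    # mean = 7.0

def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Contrast coding: Male = -1, Female = 1.
x = [-1] * len(male) + [1] * len(female)
y = male + female

b0, b1 = fit_line(x, y)
mean_of_means = (mean(male) + mean(female)) / 2
half_difference = (mean(female) - mean(male)) / 2

print(b0, mean_of_means)      # intercept next to the mean of the two group means
print(b1, half_difference)    # slope next to half the difference in means
```

Plugging the codes back in recovers the group means: b0 - b1 is the Male mean and b0 + b1 is the Female mean.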

Contrast coding provides a little more information than dummy coding, and avoids making comparisons to a fixed reference group. Nonetheless, the significance testing of the slope b1 provides the same information for both types of coding, i.e. whether or not the groups are statistically different. Essentially, a hypothesis test on the slope is asking whether or not the slope is 0. If the 95% confidence interval of b1 includes the value of 0 (e.g. -0.6 to 0.3), we cannot conclude that the two groups differ; conversely, if the 95% confidence interval of b1 does not include 0 (e.g. 0.4 to 1.3), the groups are statistically different, and the p-value of b1 will also be less than .05.
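The sketch below illustrates the slope test under both codings using hypothetical scores. `slope_test` is a helper written for this example, and the critical value 2.306 is the standard two-tailed 5% value of the t distribution with 8 degrees of freedom (10 observations minus 2 estimated parameters), hard-coded here so that no external library is needed.

```python
import math

# Hypothetical scores, invented purely for illustration.
male = [4.0, 5.0, 6.0, 5.0, 5.0]
female = [6.0, 7.0, 8.0, 7.0, 7.0]
y = male + female

T_CRIT = 2.306  # two-tailed 5% critical t for df = 8 (taken from a t table)

def slope_test(x, y, t_crit):
    """Return (b1, t statistic, 95% confidence interval) for the slope."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    # Residual variance uses n - 2 because two parameters were estimated.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, b1 / se_b1, ci

dummy = [0] * len(male) + [1] * len(female)
contrast = [-1] * len(male) + [1] * len(female)

b1_d, t_d, ci_d = slope_test(dummy, y, T_CRIT)
b1_c, t_c, ci_c = slope_test(contrast, y, T_CRIT)

print("dummy:   ", b1_d, t_d, ci_d)
print("contrast:", b1_c, t_c, ci_c)
```

The slope and its confidence interval under contrast coding are half the size of those under dummy coding, but the t statistic is identical, so both codings lead to the same conclusion about whether the interval excludes 0.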


Quite evidently, this linear regression approach is equivalent to the independent-samples t-test that most people are familiar with (specifically, the version that assumes equal variances in the two groups), and produces the same results. In the subsequent posts of this series, I will continue to explain other statistical analyses using the same method of linear regression and dummy/contrast coding.
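As a rough check of this equivalence, the sketch below computes the classic pooled-variance t statistic by hand and compares it with the t statistic of the slope from a dummy-coded regression. The scores are hypothetical, and the variable names are chosen for this example only.

```python
import math
from statistics import mean, variance

# Hypothetical scores, invented purely for illustration.
male = [4.0, 5.0, 6.0, 5.0, 5.0]
female = [6.0, 7.0, 8.0, 7.0, 7.0]
n1, n2 = len(male), len(female)

# Classic independent t-test with pooled (equal) variances.
pooled_var = ((n1 - 1) * variance(male) + (n2 - 1) * variance(female)) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_classic = (mean(female) - mean(male)) / se_diff

# The same comparison as a regression with dummy coding (Male = 0, Female = 1).
x = [0] * n1 + [1] * n2
y = male + female
mx, my = mean(x), mean(y)
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = my - b1 * mx
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (len(y) - 2) / sxx)
t_regression = b1 / se_b1

print(t_classic, t_regression)
```

Both routes use the same degrees of freedom (n1 + n2 - 2), so matching t statistics imply matching p-values as well.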
Originally published at: https://learncuriously.wordpress.com/2018/09/02/independent-t-test
