Paired two-sample t-test in Python

Apr 3, 2022

Introduction

A paired (dependent) t-test examines whether there is a statistically significant difference between the population means of two dependent measurements.

Here, "dependent" means that both measurements are taken from the same subject/object. In other words, one measurement carries information about the other. The most common example is a before/after analysis, which we’ll demonstrate in the example. In particular, we’ll use the ttest_rel function from the SciPy package to perform the t-test.

Assumptions

The paired t-test has 4 assumptions.

The data must be numerical/continuous. Otherwise, a mean can’t be meaningfully calculated.
The subjects/objects are independent. Although the two measurements are dependent because they are taken from the same subject/object, the subjects/objects themselves must be independent of one another.
The differences between pairs are normally distributed. Note that the focus is on the differences between pairs, not on the measurements themselves. This is because the paired t-test is a one-sample t-test on the differences: if the mean of the difference distribution is 0, there is no difference between the means of the two measurements.
There are no extreme values in the difference distribution. Once again, the focus is on the differences between pairs. Since the mean takes every difference into account, outliers can distort it severely and bias the test result.
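A minimal sketch of how the last two assumptions might be checked in code. The arrays `before` and `after` are made-up illustration data, and the Shapiro-Wilk test and the 1.5 × IQR rule are common choices assumed here, not methods prescribed by this article.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (synthetic, for illustration only)
rng = np.random.default_rng(0)
before = rng.normal(loc=70, scale=10, size=50)
after = before + rng.normal(loc=5, scale=3, size=50)

# The assumptions concern the differences, not the raw measurements
diff = after - before

# Normality of the differences: Shapiro-Wilk test
# (a small p-value suggests the differences are not normally distributed)
shapiro_stat, shapiro_p = stats.shapiro(diff)

# Extreme values: flag differences more than 1.5 * IQR beyond the quartiles
q1, q3 = np.percentile(diff, [25, 75])
iqr = q3 - q1
outliers = diff[(diff < q1 - 1.5 * iqr) | (diff > q3 + 1.5 * iqr)]

print(f"Shapiro-Wilk p-value: {shapiro_p:.3f}")
print(f"Number of flagged outliers: {outliers.size}")
```

A formal test like this is a complement to, not a replacement for, looking at a histogram of the differences.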

Example

Say we randomly sample 50 students, collect their midterm and final exam grades, and save the data in two arrays called mid_grades and final_grades. Our goal is to analyze whether there is a statistically significant difference between the mean midterm grade and the mean final exam grade of all students in the school (population).

The midterm and final exam grades are dependent since both come from the same subject (student). Specifically, a student who has a higher midterm grade tends to have a higher final exam grade.
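The original grade data is not included in this post. To follow along, one option is to generate stand-in arrays, as in the sketch below. The seed, means, and spreads are invented for illustration, so the numbers you get will not match the results reported later in the article.

```python
import numpy as np

# Synthetic stand-in data (NOT the article's data): 50 students,
# with final grades built from midterm grades so the pairs are dependent
rng = np.random.default_rng(42)
n_students = 50
mid_grades = rng.normal(loc=70, scale=8, size=n_students)
final_grades = mid_grades + rng.normal(loc=10, scale=3, size=n_students)

# Each index refers to the same student in both arrays
print(mid_grades[:5].round(1))
print(final_grades[:5].round(1))
```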

Now let’s state the hypotheses. Our null hypothesis is “There is no difference between the means of students’ midterm and final exam grades.”, and the alternative hypothesis is “There is a difference between the means of students’ midterm and final exam grades.”.

Before testing the hypotheses, we first look at the difference distribution. Importantly, it is the differences that should be normally distributed and free of extreme values, not the midterm or final exam grades themselves. Thus, we have to create a difference array.
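In symbols, the hypotheses and the test statistic can be sketched as follows. The notation is introduced here for illustration: \mu_d is the population mean of the differences (final minus midterm), \bar{d} and s_d are the sample mean and sample standard deviation of the differences, and n is the number of pairs.

```latex
H_0 : \mu_d = 0 \qquad \text{versus} \qquad H_1 : \mu_d \neq 0

t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \text{with } n - 1 \text{ degrees of freedom}
```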

# save difference in grades in an array
difference = final_grades - mid_grades

Then, we use a histogram to inspect the distribution, with a vertical red line marking the mean of the differences.

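import numpy as np
import matplotlib.pyplot as plt

# plot a histogram of the differences and mark their mean with a red line
mean_diff = np.mean(difference)
fig = plt.figure(figsize=(7, 5))
plt.hist(difference, bins=10)
plt.axvline(x=mean_diff, c='r')
# label the red line with the mean, rounded to 3 decimals
plt.text(mean_diff - 2, 9, round(mean_diff, 3), c='r')
# axis limits chosen for this dataset; adjust them for other data
plt.xlim(0, 20)
plt.ylim(0, 11)
plt.title('Histogram of Differences between midterm and final exam grade')
plt.xlabel("Differences")
plt.ylabel("Frequency")
plt.show()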

The histogram suggests that the differences are roughly normally distributed, so the t-test is applicable. The mean of the differences is 10.173, which is far from 0. Thus, we suspect there is a difference between the means of students’ midterm and final exam grades.

To check our suspicion, we carry out a paired t-test using the ttest_rel function.

from scipy import stats

# paired t-test; the argument order only affects the sign of the t-value
result = stats.ttest_rel(mid_grades, final_grades)

We assign the test result to a variable called result. It’s a tuple-like result object that stores both the t-value and the p-value. We only need the p-value here (but you can see the t-value by typing result.statistic).
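A short sketch of the ways the result can be read. The data below is synthetic stand-in data, not the article's grades. It also shows that swapping the two arguments flips the sign of the t-value while leaving the two-sided p-value unchanged.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (for illustration only)
rng = np.random.default_rng(1)
mid_grades = rng.normal(loc=70, scale=8, size=50)
final_grades = mid_grades + rng.normal(loc=10, scale=3, size=50)

result = stats.ttest_rel(mid_grades, final_grades)

# Access the values by attribute ...
t_value = result.statistic
p_value = result.pvalue

# ... or by tuple unpacking
t_unpacked, p_unpacked = stats.ttest_rel(mid_grades, final_grades)

# Swapping the argument order flips the sign of t only
swapped = stats.ttest_rel(final_grades, mid_grades)

print(f"t = {t_value:.3f}, swapped t = {swapped.statistic:.3f}")
```

Since the difference array was defined as final minus midterm, passing final_grades first gives a t-value with the same sign as the mean difference.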

result.pvalue

The p-value is 6.1307127709123454e-21, which is extremely close to 0. Therefore, we reject the null hypothesis at the 0.05 significance level (see how to interpret p-value), which is consistent with our suspicion. To summarize, based on the evidence our sample provides, we conclude that there is a statistically significant difference between the means of students’ midterm and final exam grades.
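To see where the p-value comes from, here is a minimal sketch that computes the t-value by hand from the differences and compares it with ttest_rel and with a one-sample t-test on the differences. It uses synthetic stand-in data, so its numbers differ from those above; the variable names and the alpha threshold are chosen here for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (for illustration only)
rng = np.random.default_rng(2)
mid_grades = rng.normal(loc=70, scale=8, size=50)
final_grades = mid_grades + rng.normal(loc=10, scale=3, size=50)

difference = final_grades - mid_grades
n = difference.size

# t = mean(d) / (sd(d) / sqrt(n)), using the sample standard deviation (ddof=1)
t_manual = difference.mean() / (difference.std(ddof=1) / np.sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The same test via SciPy, as a paired test and as a one-sample test on the differences
paired = stats.ttest_rel(final_grades, mid_grades)
one_sample = stats.ttest_1samp(difference, popmean=0)

# Decision at the 0.05 significance level
alpha = 0.05
reject_null = paired.pvalue < alpha

print(f"manual t = {t_manual:.3f}, ttest_rel t = {paired.statistic:.3f}")
print(f"reject the null hypothesis: {reject_null}")
```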

References

https://statisticsbyjim.com/basics/independent-dependent-samples/

Written by Little Dino