Demystifying Statistical Analysis 8: Pre-Post Analysis in 3 Ways

YS Chng
Published in DataSeries · Jun 18, 2021 · 9 min read
Photo by Luke Chesser on Unsplash

It’s been half a year since I last blogged! I’ve been busy with a new role in the last couple of months, and in this new job I have the privilege of working with economists who opened my eyes to types of econometric analysis that social scientists are less familiar with. Hence, I decided to write this article compiling a few different ways of conducting pre-post analysis, one of which is an econometric practice.

But before diving into the methods, I should probably first explain what I mean by pre-post analysis.

What is Pre-Post Analysis?

Pre-post analysis is conducted when one is interested in finding out whether there is a difference in observations before and after an intervention, which suggests whether or not the intervention had an effect. The intervention can be anything from administering a Covid-19 vaccine to promoting a marketing campaign. Because of its ability to inform decision-makers whether an intervention is worth pursuing, pre-post analysis is ubiquitous in almost every industry, although the methods used to conduct it may vary.

Source: https://www.irinablok.com/covidlife-cartoons-beforeafter

The following is an overview of the methods that I’ll be covering, some of which have been discussed before in previous parts of this series:

  1. Repeated Measures ANOVA
  2. ANCOVA
  3. Difference-in-Difference

1. Repeated Measures ANOVA

When it comes to conducting pre-post analysis, Repeated Measures ANOVA is probably the default method taught to all social science students. If repeated measures ANOVA sounds unfamiliar to you, fret not — it is actually just an extension of the more commonly known paired t-test, aka dependent t-test.

The paired t-test is the t-test used when the comparisons being made are within the same group. For example, if you only had a treatment group of participants who are all receiving a drug to boost their intelligence, you would run a paired t-test to compare their test scores before and after receiving the drug.
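If you'd like to try this yourself, here is a minimal sketch of that within-group comparison in Python, using scipy's paired t-test. The participant scores below are made up purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post test scores for the same 8 participants
pre_scores = np.array([95, 102, 88, 110, 97, 105, 91, 100])
post_scores = np.array([99, 108, 90, 115, 99, 111, 95, 104])

# Paired (dependent) t-test: tests whether the within-person
# differences are, on average, different from zero
t_stat, p_value = stats.ttest_rel(post_scores, pre_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```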

The Supercharged Paired t-Test

However, the paired t-test can only be used for comparisons between two time-points, and it is not able to account for a control group. Enter the “Repeated Measures ANOVA”, which is like a supercharged paired t-test: it not only makes comparisons across multiple time-points, but also allows for comparison against a control group, or in fact any number of groups, just like a normal ANOVA. Hence the name, “Repeated Measures” and “ANOVA”.

Repeated Measures ANOVA — The Supercharged Paired t-Test.

Based on the earlier example of taking a drug to boost intelligence, repeated measures ANOVA not only allows you to compare participants’ test scores before and after receiving the drug, but also compare test scores against a control group receiving a placebo. If the control group sees an increase in test scores similar to the treatment group, then perhaps the drug to boost intelligence is less effective than imagined.
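If you want to run this kind of group-by-time comparison in Python, here is a minimal sketch using the pingouin package's mixed ANOVA (pingouin needs to be installed separately). The long-format layout, column names and scores below are my own assumptions for illustration:

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: each participant appears once per time-point
df = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "group":   ["drug", "drug", "drug", "drug", "drug", "drug",
                "placebo", "placebo", "placebo", "placebo", "placebo", "placebo"],
    "time":    ["pre", "post"] * 6,
    "score":   [95, 104, 101, 109, 88, 98, 97, 99, 105, 106, 92, 93],
})

# Mixed ANOVA: 'time' is the repeated (within-subject) factor,
# 'group' is the between-subject factor (treatment vs control)
aov = pg.mixed_anova(data=df, dv="score", within="time",
                     subject="subject", between="group")
print(aov)
```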

Tests for Within-Group Comparisons

If this still sounds confusing, the only thing you need to remember is that the comparisons for pre-post analysis are within-group, and the only appropriate tests for within-group comparisons are the paired t-test and repeated measures ANOVA. However, repeated measures ANOVA has an advantage over the paired t-test in allowing the inclusion of a control group. Simple as that.

To learn more about how these different tests are related to one another, please refer to this statistical analysis cheat sheet that I created previously.

2. ANCOVA

I have actually written about ANCOVA in a previous post, but not in the context of using it for pre-post analysis, so I will explain that here.

ANCOVA (Analysis of Covariance) is an extension of ANOVA in which covariates can be freely added to the analysis, to test whether their inclusion affects the statistical difference between the treatment and control groups.

Using the example of taking a drug to boost intelligence, the primary analysis is a comparison of test scores between the treatment and control groups, after they receive the drug and placebo respectively. But in this method, the test scores before receiving the drug/placebo are also added to the analysis as a covariate.

Representing this analysis in a linear regression (I explained how to do this in my previous post):

Ŷi = b0 + b1Xi + b2Zi

Where,

  • Ŷi is the test scores after receiving the drug/placebo
  • Xi is categorically coded by treatment vs control, which is the primary analysis
  • Zi is the test scores before receiving the drug/placebo, which is the covariate

Sample data for running ANCOVA.

Interpreting the Analysis

The statistical test on b1 tells us whether the treatment and control groups are statistically different, while the statistical test on b2 tells us whether test scores after receiving the drug/placebo are predicted by test scores before receiving the drug/placebo. In other words, the test on the covariate’s coefficient tells us whether there is an effect of time on the groups as a whole.

What we want to look for in the analysis is statistical significance in b1, even after the test scores before receiving the drug/placebo have been included as a covariate. Ideally, b2 should not be statistically significant, but the beauty of this approach is that even if b2 is significant, all that matters is that the F-statistic of the overall regression model remains significant with Zi included as a covariate.
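For readers who prefer to see it in code, here is a minimal sketch of this ANCOVA fitted as a linear regression with statsmodels. The column names and numbers are made up for illustration; the formula simply mirrors the regression equation above:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per participant
df = pd.DataFrame({
    "group": ["drug"] * 5 + ["placebo"] * 5,                  # Xi (treatment vs control)
    "pre":   [95, 101, 88, 110, 97, 99, 92, 105, 90, 96],     # Zi (covariate)
    "post":  [104, 109, 96, 118, 102, 100, 94, 107, 91, 97],  # Yi (outcome)
})

# ANCOVA expressed as a linear regression: post ~ b0 + b1*group + b2*pre
model = smf.ols("post ~ C(group) + pre", data=df).fit()
print(model.summary())
```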

3. Difference-in-Difference

Difference-in-Difference analysis (aka DiD) is the new type of analysis that I learnt from my economist colleagues. I would like to caveat that what I know is probably at a very superficial level, but what I aim to do is to explain its essence in the simplest way possible. If I do make any mistake, please pardon me and feel free to point it out in the comments below.

So as it turns out, DiD is less commonly heard of among social scientists like psychologists, because it is usually taught to economists in econometrics. But if you are familiar with expressing comparison analyses such as t-tests or factorial ANOVA as a linear regression (explained in a previous post), then DiD becomes a lot less intimidating.

Graphical explanation of DiD
(Source: https://www.publichealth.columbia.edu/research/population-health-methods/difference-difference-estimation)

Using the same example of taking a drug to boost intelligence, let’s first take a look at the regression equation of the DiD method:

Ŷi = b0 + b1Xi + b2Ti + b3XiTi

Where,

  • Ŷi is the test scores for both before and after receiving the drug/placebo
  • Xi is categorically coded by 1 for treatment vs 0 for control
  • Ti is categorically coded by 1 for post-intervention vs 0 for pre-intervention
  • XiTi is the interaction variable computed by multiplying Xi with Ti

Economists may use other fancy mathematical notations to represent their variables or coefficients, but if you look closely at the core of the regression structure, it is essentially these components. In fact, if you compare this regression structure to the one in a 2×2 factorial ANOVA, they are actually the same! One might then ask, how is DiD different from factorial ANOVA, or even repeated measures ANOVA?

Dummy Coding in DiD

First of all, to understand how DiD is different, you’ll need to know that the type of coding used in DiD is strictly dummy coding. This is very important. If you followed my series on how to express statistical analysis in linear regression, you’ll note that the type of coding I used was contrast coding. Depending on the type of coding used, the interpretation of the coefficients can be quite different (explained in a previous post).

Sample data for running DiD.

Because DiD uses dummy coding, with 1 representing the treatment group and post-intervention for their respective variables, the code of 1 in the interaction variable XiTi then represents the treatment group post-intervention, while 0 represents all the other combinations (i.e. treatment group pre-intervention, control group post-intervention, control group pre-intervention). And for the DiD method, the b3 coefficient is in fact the only coefficient of interest in the entire regression analysis.

The statistical test on b3 tells us whether the test scores of the treatment group post-intervention are different from all the other combinations aggregated, which from the perspective of DiD is a representation of the treatment having an effect, with time being accounted for.
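Here is a minimal sketch of the dummy-coded DiD regression in statsmodels. The toy data and column names are my own; the coefficient on the treat:post interaction term corresponds to b3:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant per time-point
df = pd.DataFrame({
    "treat": [1, 1, 1, 1, 0, 0, 0, 0,    # Xi: 1 = treatment, 0 = control
              1, 1, 1, 1, 0, 0, 0, 0],
    "post":  [0, 0, 0, 0, 0, 0, 0, 0,    # Ti: 1 = post-intervention, 0 = pre
              1, 1, 1, 1, 1, 1, 1, 1],
    "score": [95, 101, 88, 110, 99, 92, 105, 90,     # pre-intervention scores
              104, 112, 97, 121, 100, 93, 106, 90],  # post-intervention scores
})

# DiD regression: score ~ b0 + b1*treat + b2*post + b3*(treat x post)
model = smf.ols("score ~ treat + post + treat:post", data=df).fit()
print(model.params)
```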

Contrast Coding in Factorial ANOVA

However, if contrast coding is used, as in a 2×2 factorial ANOVA, the interpretation of the b3 coefficient becomes quite different. Due to the use of the codes -1 and 1, the resulting interaction variable XiTi ends up grouping the treatment group post-intervention together with the control group pre-intervention, and comparing them against the treatment group pre-intervention together with the control group post-intervention.

Sample data using contrast coding. Resulting XiTi value that is different from dummy coding is circled in red.

The statistical test on b3 then tells us whether the comparison of treatment vs control is moderated by time (read more about moderation in my previous post). In layman’s terms, the result of this test informs us whether the difference between treatment and control differs across time (see graph below).
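To see how the coding scheme changes the interpretation, here is a sketch that recodes the same toy data from the dummy-coded example above into -1/1 contrast codes and refits the regression. Only the meaning of the coefficients changes; the overall model fit stays the same:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same hypothetical data as the dummy-coded DiD sketch above
df = pd.DataFrame({
    "treat": [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    "post":  [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    "score": [95, 101, 88, 110, 99, 92, 105, 90,
              104, 112, 97, 121, 100, 93, 106, 90],
})

# Recode the 0/1 dummies into -1/1 contrast codes
df["treat_c"] = 2 * df["treat"] - 1
df["post_c"] = 2 * df["post"] - 1

# Same regression structure, but with contrast coding the interaction
# coefficient b3 now tests whether the treatment vs control difference
# is moderated by time
model = smf.ols("score ~ treat_c + post_c + treat_c:post_c", data=df).fit()
print(model.params)
```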

Pseudo-Experiments & Non-Random Assignment

The reason why economists use DiD is that the method was derived for analysing real-world cases under pseudo-experimental conditions, which are more common in economic research. In the real world, groups selected for comparison are often not randomly assigned, and could therefore possess inherent differences. Hence, the DiD method is only interested in knowing whether the treatment group post-intervention is different when compared against all other possibilities.

For lab experiments, however, random assignment is assumed to have taken care of extraneous variables, such that the observations pre-intervention should not differ between treatment vs control. Hence, in the case of 2×2 factorial ANOVA, any differences observed should only be explained by either the effect of group, or the effect of time. That’s why for 2×2 factorial ANOVA, b3 is not the only coefficient of interest — b1 informs us if there is an overall effect of group, while b2 informs us if there is an overall effect of time.

Graphical explanation of b1 , b2 and b3 in 2×2 factorial ANOVA.

Pairing of Observations in Within-Group Comparisons

At this point, you might be wondering: then what about repeated measures ANOVA? How is it different from factorial ANOVA, and why don’t economists use it instead? I didn’t express repeated measures ANOVA in a regression equation because it differs from factorial ANOVA in the sense that the observations between pre- and post-intervention are actually paired, the same way that the paired t-test differs from the independent t-test.

Technically, if the pre-post analysis is within-group, then the pre and post observations should be paired because they share the same variance. Unfortunately, why the DiD method doesn’t take that into account is something I have yet to figure out, and I’m also not sure if it affects the interpretation of the analysis.

Conclusion

Now, the point of this article is not to argue which method is superior. I personally believe that these various methods have been introduced to deal with different contexts, so what’s actually more important is to understand the context that we’re trying to analyse, and then pick the most appropriate method to use.

To find out more about when you should use ANCOVA instead of repeated measures ANOVA, The Analysis Factor did a pretty decent job of explaining it succinctly here.

But perhaps my biggest takeaway from changing jobs so far is realising that there is in fact so much to learn from other disciplines. If we close our minds and don’t find out more, thinking that our way of doing something is the only way to do it, then it is only our loss to miss out on a learning opportunity.


YS Chng
DataSeries

A curious learner sharing knowledge on science, social science and data science. (learncuriously.wordpress.com)