Generalised Regression Difference in Differences

with Fixed Effects, Multiple Treatment Periods and Dynamic Treatment Effects

Gerwyn Ng
eat-pred-love
Dec 20, 2019


What is Difference in Differences?

Difference in differences (DiD) is a non-experimental statistical technique used to estimate treatment effects by comparing how the difference in observed outcomes between treatment and control groups changes from the pre-treatment to the post-treatment period.

DiD is commonly used to recover the causal effect of interest from observational study data, where the experimental design is out of the researcher’s control (e.g. a natural experiment) and the data are usually subject to unobserved confounders and some form of selection bias.

Causal Pathway Illustration: Z (confounder) is the cause of both X (independent variable) and Y (dependent variable) and thus, obscures the relationship between X and Y.

The DiD estimator is, in fact, a version of the Fixed Effects Model that uses aggregated data. As such, its data requirements are less demanding and it works well with repeated cross-sectional data. In contrast, panel data are relatively more expensive and difficult to collect.

Intuition

DiD is a combination of time-series difference (compares outcomes across pre-treatment and post-treatment periods) and cross-sectional difference (compares outcomes between treatment and control groups).

There are two equivalent strategies to think about the two “differences”:

Mechanical Description

Mechanically, the DiD is the difference between these two changes:

  1. changes in outcomes from pre-treatment to post-treatment
  2. changes in outcomes between treatment and control groups
Graphical Illustration

Regression DiD

While it is possible to obtain the DiD estimator by calculating the means by hand, using a regression framework may be more advantageous as it:

  • outputs standard errors for hypothesis testing
  • can be easily extended to include multiple periods and groups
  • allows the addition of covariates

Simple DiD

The simplest setup involves observed outcomes for two groups for two time periods:

  1. The treatment group is exposed to treatment in the second period but not the first
  2. The control group is not exposed to treatment during either period

In this case, the treatment effect can be estimated by subtracting the average change in the control group from the average change in the treatment group. Doing so removes fixed, unobserved biases in the second-period comparison that could result from:

a. inherent differences between the two groups, and

b. changes in trend from the first to the second period that are common to both groups.
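For example, with purely hypothetical numbers: if the treatment group’s average outcome rises from 10 to 18 (a change of +8) while the control group’s rises from 9 to 12 (a change of +3), the DiD estimate of the treatment effect is 8 − 3 = 5.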

With repeated cross-sectional data, the regression model can be defined as:

y = β₀ + β₁P + β₂T + β₃(P × T) + ε

where y is the outcome of interest, P is a dummy variable for the second time period and T is a dummy variable for the treatment group. The interaction term, P × T, is equivalent to a dummy variable that equals 1 for observations in the treatment group in the second period.

The coefficients can be interpreted as follows:

The simple DiD estimator allows the intercept to vary between the treatment group (β₀ + β₂) and the control group (β₀) and assumes a common change in outcomes across the two time periods (β₁).
  • β₀: Average y of the control group in the first time period
  • β₁: Average change in y from the first to the second time period that is common to both groups
  • β₂: Average difference in y between the two groups that is common to both time periods
  • β₃: Average differential change in y from the first to the second time period for the treatment group relative to the control group
Graphical Illustration of Coefficients

By OLS, the simple DiD estimator is given as:

β̂₃ = (ȳ_treatment, post − ȳ_treatment, pre) − (ȳ_control, post − ȳ_control, pre)

that is, the average change in y for the treatment group minus the average change in y for the control group.
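As a quick illustration, the simple model can be sketched in R with lm(), assuming a data frame df with hypothetical columns y (outcome), post (second-period dummy) and treated (treatment group dummy):

#fit the interaction model; the coefficient on post:treated is the DiD estimate (β₃)
fit <- lm(y ~ post * treated, data = df)
summary(fit)

#equivalently, compute the difference in mean changes by hand
with(df, (mean(y[post == 1 & treated == 1]) - mean(y[post == 0 & treated == 1])) -
         (mean(y[post == 1 & treated == 0]) - mean(y[post == 0 & treated == 0])))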

Generalised DiD: Fixed Effects and Multiple Treatment Periods

Since the regression framework allows us to add more covariates, we can formulate a more general DiD regression which allows for:

  1. the intercept term to vary for each cross-sectional unit (αᵢ instead of β₂)
  2. the common change in outcomes to vary across time (δt instead of β₁)
  3. different timing of the treatment for different treated units (dt instead of Pt)
  • αᵢ: Individual fixed effects that vary across individuals (e.g. state-specific characteristics, an individual’s gender)
  • δt: Time fixed effects that vary across time (e.g. year dummies that allow the intercept to vary across years)
  • dt: A dummy variable which equals 1 if the unit of observation is in its post-treatment period (in contrast to Pt, which equals 1 in the second time period)
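Putting these pieces together, the generalised specification can be written as:

yᵢt = αᵢ + δt + β₃(dt × Tᵢ) + εᵢt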
Graphical Illustration: In the generalised model, we allow the intercepts to vary across both time and individuals

Generalised DiD with Dynamic Treatment Effects

To capture dynamic treatment effects, we allow our DiD estimator, β₃, to vary across time by using a time-variant coefficient, ρt, on the interaction term, Dt × Tᵢ:

yᵢt = αᵢ + δt + Σt ρt(Dt × Tᵢ) + εᵢt

where Dt = 1 in period t and 0 otherwise.

This specification allows us to capture leads and lags of the treatment.
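A rough sketch of such a specification with the lfe package, assuming a balanced panel df with hypothetical columns y (outcome), treated (treatment group dummy), period (1 to 4, with treatment starting in period 3 for the treated group) and id (unit identifier):

#import
library(lfe)
#construct the interaction dummies Dt × Tᵢ by hand; period 1 is the omitted baseline
df$D2 <- as.integer(df$treated == 1 & df$period == 2) #lead (anticipation) term
df$D3 <- as.integer(df$treated == 1 & df$period == 3) #treatment period
df$D4 <- as.integer(df$treated == 1 & df$period == 4) #one period after treatment
#unit and period fixed effects, no IVs, standard errors clustered by unit
dyn_fit <- felm(y ~ D2 + D3 + D4 | id + period | 0 | id, data = df)
summary(dyn_fit) #the coefficients trace out the leads and lags (ρt)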

Graphical Illustration: With dynamic treatment effects, the outcomes for the treated in the post-treatment period are no longer parallel to the assumed counterfactual

Keeping it Causal: The Parallel Trends Assumption

For our DiD estimator to be causal, we require the parallel trends assumption to hold. This assumption allows us to use the control group as a proxy for the assumed counterfactual trend (which is unobserved) of the treatment group. That is, in the absence of treatment, both the treatment and control groups should experience the same change in outcome.

Graphical Illustration: In the absence of treatment, the treated should observe the same trend as the control group. If this assumption holds, we can then attribute the change in the trend of the treated, in the presence of treatment, to the treatment itself.

Threats to Causal Inference using DiD

Although the DiD method is intended to mitigate the effects of confounders and selection bias, it may still be subject to threats that can invalidate its causal inference.

Violation of the Parallel Trends Assumption

The most important assumption is violated when there exist unobserved factors that are correlated with both treatment status and the timing of the treatment. In other words, there could be factors (besides the treatment) that cause a change in one group but not the other at the same time as the treatment. Issues that could compromise the results include autocorrelation and the Ashenfelter dip (where treatment is assigned primarily based on pre-existing differences in outcomes).

While the Parallel Trends Assumption is typically untestable (since the counterfactual is never observed), the following strategies are often applied to improve the credibility of results:

  • Verify that pre-treatment trends are parallel for a reasonable timeframe (see the sketch after this list)
  • Cross validate results using alternative control or treatment groups
  • Test if there exists a treatment reversal effect
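To illustrate the first strategy, a rough visual pre-trend check can be sketched in R, assuming a data frame df with hypothetical columns y (outcome), period and treated (treatment group dummy):

#import
library(dplyr)
library(ggplot2)
#plot average outcomes by group across periods; roughly parallel lines before
#treatment support (but do not prove) the parallel trends assumption
df %>%
  group_by(treated, period) %>%
  summarise(mean_y = mean(y)) %>%
  ggplot(aes(x = period, y = mean_y, colour = factor(treated))) +
  geom_line()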

Compositional Difference

Because DiD is often implemented using repeated cross-sectional data, changes in the composition of the samples across periods can contaminate the estimate: the measured effect may simply reflect a change in who is in each group, rather than the treatment itself.

Extrapolation

Depending on the setting of interest, results may not generalise to other populations or to a longer timeframe.

Implementation with R

Since the DiD estimator is a version of the Fixed Effects Model, the DiD regression can be estimated as a fixed effects linear regression using the lfe package in R.

The general syntax is as follows:

felm(outcome ~ covariates | fixed effects | IV specification | clusters, data = your_data)

An example:

  1. The dependent variable, y
  2. The independent variable whose coefficient is of interest, x
  3. Fixed effects, f1 and f2
  4. Endogenous variables Q and W instrumented by z1 and z2
  5. Clusters, c1 and c2, to compute standard errors which allow for correlation within all combinations of c1 and c2
#import
library(lfe)       #provides felm()
library(magrittr)  #provides the %>% pipe
#fit your regression
felm(y ~ x | f1 + f2 | (Q | W ~ z1 + z2) | c1 + c2, data = df) %>%
  summary(robust = TRUE) #display results w/ robust standard errors
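Tying this back to the generalised DiD regression above, a minimal sketch (assuming hypothetical columns y for the outcome, d for the post-treatment indicator dt × Tᵢ, id for the unit and period for the time period) could look like:

#unit and period fixed effects, no instruments, standard errors clustered by unit
did_fit <- felm(y ~ d | id + period | 0 | id, data = df)
summary(did_fit, robust = TRUE) #the coefficient on d is the DiD estimate (β₃)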

