Diff in Diff Testing (Python)

3 min readOct 9, 2020

Many times in hypothesis testing we do not have a proper target and control set up. There will be a pre-existing bias between them. Diff in diff (DID) testing is a quasi-experimental method that helps us estimate the causal effect in such cases. Even though this is mostly employed for longitudinal data (same samples tracked at different points in time), this concept can be used for a variety of other applications as well.

Let’s say you have launched an offer (50% off on all products) during Thanksgiving and the sales of the product are at an all-time high. Great, the offer worked! But wait a minute, you notice that historically the product sales always peaks during Thanksgiving. In this case, you cannot randomly allocate a control that exactly matches your target performance before the launch of the offer. So how do you measure the actual impact of the offer? DID testing is the answer!

Terms

Target: Segment of customers who are exposed to the treatment (say, certain zip codes targeted for the 50% offer)
Control or Holdout: Segment of customers who are not exposed to the treatment (say, zip codes that are similar in performance to the target before offer launch)
Treatment: New method that is intended to be tested (50% offer)
Note: Since the target and control population are chosen according to the zip code and not designed individually for customers, there will inherent differences that exist even before the offer is launched

Methodology

Consider target and holdout (control) groups that are measured from two different periods. The aim here is to identify the true life because of the treatment. From our previous example, the intent here is to understand what is the incremental sales (actual lift) from the 50% offer.

Explanatory diagram for the 50% offer introduced for Thanksgiving

Clearly from the above diagram, the lift in Target sales (between October to November) is not purely from the 50% offer. There is a trend involved that has to be accounted for. The actual lift is calculated as follows:

Incremental sales in Target = Target sales in post-period (Nov) — Target sales in the pre-period (Oct)
Incremental sales in Holdout = Holdout sales in post-period (Nov) — Holdout sales in the pre-period (Oct)
Actual lift from the offer = Incremental sales in Target — Incremental sales in Holdout

This can be solved in a simple linear regression using interaction variables. In the below equation β₃ gives the actual change in the target with respect to the control group.

Simple regression expression and creation of dummy variables

In the above equation, a and t are dummy variables created using the above table. Here is the mapping for the regression:

y is the observed outcome variable (sales in our example)
t denotes time period (1 for post and 0 for pre)
a denotes the segment of interest ( 1 for target and 0 for holdout)
a*t is the interaction variable

Python Code

Sample python code available here

Things to keep in mind

This article will focus more on the applications of this technique rather than diving deep into the math behind it. DID comes handy when macroeconomic conditions impact the outcome variable. These unobservable conditions (weather, customer demand, seasonality, etc.) can sometimes tip the scale in the wrong direction. To minimise bias from uncontrolled variables

The test is based on the assumption that the target group and the control group approximately have the same trend before the treatment was introduced.

References

Diff in Diff testing lecture explaining the beta variables in detail
Example explained in Stata
Diff in diff and synthetic control

Diff in Diff Testing (Python)

Written by Sadhave S R