Causal Inference techniques for Panel Data in Retail — Part II

Shivangi Choudhary
Dec 26, 2023


Hello readers! In my previous article, Comprehensive Guide on Causal Inference in Retail — Part I, I discussed the assumptions, challenges, and various causal inference techniques for observational data. Today I will cover causal inference techniques for panel data, which are especially relevant when we want to measure the effect of a marketing or operational initiative. They help assess the impact and effectiveness of a particular action by examining how key performance indicators (KPIs) or other relevant metrics have changed over time.

In retail, the intervention can be the launch of a new product (where we may want to measure the change in NPS), a new pricing strategy (and its effect on sales volume), or a change in store layout (and its effect on customers' average transaction value, or ATV). Let's now discuss some of the causal inference methods used for panel data (time series data across different cross sections).

1. Pre-post analysis

It is used to find out whether there is a difference in observations before and after the intervention. Pre-post analysis is performed only on treated units, under the assumption that everything else remains constant between the pre and post periods.

  • We can use a paired t-test (for comparison between two time points), repeated measures ANOVA (for comparisons across multiple time points and multiple groups), or ANCOVA (an extension of ANOVA with added covariates).
  • Depending on whether the normality and equal-variance assumptions hold, we can use Welch's t-test (for unequal variances) or the non-parametric Mann-Whitney U test to estimate the average treatment effect (ATE).
  • It focuses on measuring changes in outcomes/KPIs over time within the same group or population.
  • Pre-post analysis assumes that everything else is constant between the pre and post periods, which is seldom the case.

Ex: While analyzing the effect of store layout changes, we need to assume that everything else, such as inflation, competition, or other macroeconomic factors, was the same before and after the layout change.
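As a minimal sketch (with made-up numbers, not real store data), the paired t-test above can be computed from the pre/post differences alone for the same set of stores:

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired t-test statistic: t = mean(diff) / (sd(diff) / sqrt(n))."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se

# Hypothetical weekly sales for 5 stores before/after a layout change
pre  = [100, 120, 95, 110, 105]
post = [108, 125, 99, 118, 112]
t_stat = paired_t(pre, post)
print(round(t_stat, 2))
```

In practice you would use `scipy.stats.ttest_rel`, which also returns the p-value; the hand computation here just makes the mechanics explicit.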

2. Difference in difference (DiD)

To relax the strong assumption of the pre-post method, DiD relies on the parallel trends assumption, i.e., it assumes that external factors affect the treatment and control groups in the same way over time. It is commonly used to assess the effect of macro-interventions where the treatment is applied to a larger section, for example multiple regions or multiple brands. This method compares the difference in outcomes between two or more groups before and after an intervention, allowing for a comparison of changes between treatment and control groups. It can be applied in scenarios where we have two existing groups (e.g., Apparel and Shoes) that were not assigned in a randomized trial setup, and the treatment is applied to one of them. Since this method relies on the parallel trends assumption, we can model the difference between the treatment and control groups in both the pre and post periods, under the assumption that any external factor in the post period has the same effect on both groups.

  • DiD takes a treated unit before and after the treatment and compares its trend with that of a control unit.
  • It can be modelled as a regression problem, where we are only interested in the coefficient of the interaction term (treated × post).
  • The difference in differences between treatment and control is the estimate of the average treatment effect on the treated (ATT).
  • The presence of unobserved time-varying confounders can cause the parallel trends assumption to fail.
  • Hence, this method can't be used if the trends of the treatment and control units differ in the pre period.
Figure: Difference in Differences (source: https://www.publichealth.columbia.edu/research/population-health-methods/difference-difference-estimation)

Ex: In retail marketing campaigns, we can use DiD to estimate the actual impact of a campaign, since in most settings the control groups created as part of A/B testing are very small (<5%) and hence might not be representative of the treatment group.
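A minimal sketch with hypothetical group means: for balanced groups, the coefficient on the treated × post interaction reduces to a simple double difference, which we can compute directly:

```python
# Hypothetical average weekly sales per (group, period) — illustrative only
means = {
    ("control", "pre"): 100.0, ("control", "post"): 110.0,
    ("treated", "pre"): 102.0, ("treated", "post"): 125.0,
}

def did(means):
    """DiD = (treated post - treated pre) - (control post - control pre)."""
    treat_change = means[("treated", "post")] - means[("treated", "pre")]
    ctrl_change = means[("control", "post")] - means[("control", "pre")]
    return treat_change - ctrl_change

lift = did(means)
print(lift)  # 23-unit treated change minus the 10-unit common trend
```

In a regression framing, fitting `y ~ treated + post + treated:post` (e.g., with statsmodels) returns this same number as the interaction coefficient when the panel is balanced.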

3. Fixed Effects Model

Fixed effects (FE) models are used when we are interested in analyzing the impact of variables that vary over time. FE can be used in scenarios where the parallel trends assumption of DiD doesn't hold. It explores the relationship between predictor and outcome variables within a group (such as a country or department). FE assumes that something within the individual may bias the outcome, and we need to control for it. FE removes the effect of time-invariant characteristics, so that we can get the net effect of the predictor variables on the outcome. FE also assumes that these time-invariant characteristics are not correlated with the other predictor variables.

  • FE regression models are used to avoid omitted variable bias. Entity effects are included by adding dummy variables for each entity.
  • For FE models the standard errors need to be clustered. A two-way FE model assumes the causal effect is constant across entities/segments and also controls for shocks that are common to all entities in a given period but vary across periods (e.g., in linearmodels: model.fit(cov_type='clustered', cluster_entity=True, cluster_time=True)).

Ex: In retail, FE models can be used to analyze an output metric of interest (sales, purchase patterns, etc.) across different entities such as customer segments, store grades, promotional events, or even product classes/subclasses. Thus they can be used to account for heterogeneity across entities. If we don't model it using FE, we introduce omitted variable bias.
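A minimal sketch of the within (demeaning) transformation behind FE, on made-up data: subtracting each entity's own mean removes its time-invariant effect, after which a simple OLS slope recovers the common coefficient:

```python
from statistics import mean

# Hypothetical panel rows (entity, x, y), where y = 2*x + entity_effect
rows = [
    ("A", 1.0, 12.0), ("A", 2.0, 14.0), ("A", 3.0, 16.0),  # effect +10
    ("B", 1.0, 52.0), ("B", 2.0, 54.0), ("B", 3.0, 56.0),  # effect +50
]

def fe_slope(rows):
    """Demean x and y within each entity, then fit slope = Sxy / Sxx."""
    by_entity = {}
    for e, x, y in rows:
        by_entity.setdefault(e, []).append((x, y))
    xd, yd = [], []
    for obs in by_entity.values():
        mx = mean(x for x, _ in obs)
        my = mean(y for _, y in obs)
        for x, y in obs:
            xd.append(x - mx)
            yd.append(y - my)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

slope = fe_slope(rows)
print(slope)  # entity effects (+10, +50) are differenced out
```

This is exactly what `linearmodels.PanelOLS(y, X, entity_effects=True)` does under the hood on a pandas MultiIndex panel, with clustered standard errors available via `fit(cov_type='clustered', cluster_entity=True)`.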

4. Random Effects Model

FE controls for entity effects, but in doing so it absorbs all time-invariant variables, and if the error terms are correlated with the regressors, the inferences drawn might not be valid. When, instead, the entity's error term can be assumed to be uncorrelated with the predictors, a random effects (RE) model should be used; under this non-correlation assumption, time-invariant variables can be included as predictor variables.

  • RE can be useful in scenarios where FE would fail, such as when characteristics vary over time (e.g., a customer's purchasing power changing over time).
  • We can use an RE model to check whether the output metrics differ across groups, and whether there are time-varying confounders we are not including in the model.
  • We can then decide on the level at which models need to be built, based on the heterogeneity across groups (e.g., one model per group).

In retail, if we find that items differ (in selling pattern, price elasticity, etc.) and that the difference is related to other predictor variables such as department, subclass, or price, then we can model this heterogeneity using FE. While using an RE model, in contrast, we are assuming that the item effects are uncorrelated with the other predictors. The Hausman test can be used when we are not sure whether to choose FE or RE.
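A minimal sketch of the Hausman test logic, with hypothetical FE/RE estimates for a single coefficient (in practice these come from fitted models, e.g., via statsmodels or linearmodels): the statistic compares the two estimates, scaled by the difference of their variances, and is chi-squared distributed under the null that RE is consistent:

```python
# Hypothetical coefficient estimates and variances — illustrative only
b_fe, var_fe = 1.90, 0.040   # FE: consistent, but less efficient
b_re, var_re = 1.50, 0.015   # RE: efficient only if effects are uncorrelated

def hausman(b_fe, var_fe, b_re, var_re):
    """Single-coefficient Hausman statistic:
    H = (b_fe - b_re)^2 / (var_fe - var_re), chi2 with 1 df here."""
    return (b_fe - b_re) ** 2 / (var_fe - var_re)

h = hausman(b_fe, var_fe, b_re, var_re)
# chi2(1) critical value at 5% is about 3.84; H above it favours FE
print(round(h, 2), "-> prefer FE" if h > 3.84 else "-> RE is fine")
```

With several coefficients the same idea uses vectors and the inverse of the covariance-difference matrix; the one-dimensional version above just shows the decision rule.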

5. Synthetic Control Method

Synthetic control methods were originally proposed in Abadie and Gardeazabal (2003) and Abadie et al. (2010) with the aim of estimating the effects of aggregate interventions in quasi-experimental designs. They are best suited to interventions happening at an aggregate level that affect a small number of large units (such as cities, regions, or countries), like measuring the causal effect of a change in taxation law. SCM typically uses a relatively long time series of the outcome prior to the intervention and estimates weights such that the weighted combination of controls mirrors the treatment unit as closely as possible.

  • This method exploits the temporal variation in the data rather than the spatial one.
  • We don't need to find a single control unit similar to the treatment unit (as in matching).
  • We can forge our own control unit as a combination of multiple untreated units, creating a "synthetic control".
  • Unlike DiD, this method can also account for the effect of confounders changing over time, by weighting the control units to better match the treatment unit before the intervention.
  • It can be seen as a vertical regression, where the different control units are used as features (Xs) and the different time intervals are the observations (rows). So the control units are variables captured across different time frames, and the synthetic outcome is the weighted average of those units.
  • One assumption of the synthetic control method is that the donor pool (of untreated units) contains good candidates for the synthetic control. When this assumption doesn't hold, we might want to use propensity scores to select a good donor pool before implementing SCM.

Some use cases of SCM in online retail include identifying the effect of launching a new website feature in one geography. It can also be used to measure the effectiveness of automation or innovation in operations, where a test-and-control setup is not possible or is unethical (e.g., choosing only certain customers for an improved experience). In offline retail, it can be used to capture the causal impact of promotional events on certain stores/cities before launching them at a national level.
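A minimal sketch with made-up series and just two donor units: search over convex weights that minimize the pre-period gap, then read the effect as the post-period difference between the treated unit and its synthetic counterpart (real implementations solve a constrained quadratic program over many donors):

```python
# Hypothetical monthly sales: periods 0-3 are pre-intervention, 4-5 are post
treated = [10, 12, 11, 13, 20, 22]
donor_a = [9, 11, 10, 12, 13, 14]
donor_b = [12, 14, 13, 15, 16, 17]
PRE = 4

def synth_weight(y, a, b, pre):
    """Grid-search the convex weight w minimizing the pre-period squared
    error between y and the synthetic series w*a + (1-w)*b."""
    best_w, best_err = 0.0, float("inf")
    for i in range(101):
        w = i / 100
        err = sum((y[t] - (w * a[t] + (1 - w) * b[t])) ** 2
                  for t in range(pre))
        if err < best_err:
            best_w, best_err = w, err
    return best_w

w = synth_weight(treated, donor_a, donor_b, PRE)
synthetic = [w * x + (1 - w) * y for x, y in zip(donor_a, donor_b)]
# Estimated effect: average post-period gap between treated and synthetic
effect = sum(treated[t] - synthetic[t] for t in range(PRE, len(treated))) / 2
print(w, round(effect, 2))
```

The synthetic series tracks the treated unit closely before period 4 and then falls behind it, and that post-period gap is the estimated causal impact.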

6. Synthetic DiD

Synthetic DiD tries to overcome, to some extent, the challenge of finding a good donor pool in SCM. Synthetic DiD can be interpreted as a vertical regression (unit weights) plus a horizontal one (time weights). The difference from SCM is that it allows for an intercept and adds a penalty term. The intercept reflects the fact that the goal of the weights is no longer to perfectly match the treatment group, but only to mimic its trend.

These are some of the commonly used techniques for measuring causal impact in panel and pooled data. In other scenarios we can combine methods, for example using propensity score matching (discussed in Part I) to select the donor pool and then applying the synthetic control method. That's all for now. Keep learning and keep writing!
