CAUSALITY: CAUSE AND EFFECT

Jatin Madan · AI Skunks · Apr 23, 2023

Source: DALL-E with the prompt “A blue orange sliced in half laying on a blue floor in front of a blue wall”

Abstract

Consider the figure above. Yes, the orange. I’ll give you a moment.

The natural question is, “What does an orange have to do with causality?” Well, it’s funny because there is no relationship between oranges and causality. However, your mind, either intentionally or unintentionally, probably tried to make a connection.

It boils down to the basic question of why. Why an orange? Why did I click on this blog post? Why am I still reading this? The fact is we are constantly asking ourselves why. Our minds automatically try to weave together causal stories to help us make sense of a perhaps senseless world. Derivatives of this question follow naturally: What is the cause? What is the reason for this? What's next? Where is this going?

What is Causality?

Causal inference requires divining causal relationships from existing data while accounting for uncertainty; like prediction, it relies on assumptions that cannot be verified against ground truth. Let's take an example and walk through the three steps of causal inference.

The first is statistical and predictive reasoning, which covers most of what we do in machine learning. We may make sophisticated forecasts, infer latent variables in complex deep generative models, or cluster data according to subtle relations.

Example: a bank wishes to predict which of its current business loans are likely to default, so it can make financial forecasts that account for likely losses.

The second is interventional reasoning, which allows us to predict what will happen when a system is changed. This enables us to describe what characteristics are particular to the exact observations we’ve made, and what should be invariant across new circumstances. This kind of reasoning requires a causal model.

Example: a bank would like to reduce the number of loans that default, and considers changing its policies. Predicting what will happen as a result of this intervention requires that the bank understand the causal relations which affect loan defaulting.

The third step is counterfactual reasoning where we can talk not only about what has happened but also about what would have happened if circumstances were different. Counterfactual reasoning requires a more precisely specified causal model than intervention.

Example: a bank would like to know what the likely return on loan would have been, had they offered different terms than they did.
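To make these levels concrete, here is a minimal sketch in Python, assuming a toy linear model in which a borrower's risk score drives both the interest rate and the default probability (all variable names and coefficients are illustrative, not drawn from real data). It contrasts what we observe (prediction) with what would happen under a policy change (intervention).

import numpy as np

rng = np.random.default_rng(0)

# Toy structural causal model (illustrative only):
# risk_score -> interest_rate, and both risk_score and interest_rate -> default
risk_score    = rng.normal(0, 1, 10_000)
interest_rate = 0.8 * risk_score + rng.normal(0, 0.5, 10_000)
default_prob  = 1 / (1 + np.exp(-(0.5 * risk_score + 1.0 * interest_rate)))

# 1. Prediction (association): default rate observed among high-rate loans
observed = default_prob[interest_rate > 1.0].mean()

# 2. Intervention: do(interest_rate = 1.0) for every borrower; risk_score is
#    left untouched, only the rate is overridden by the new policy
intervened = (1 / (1 + np.exp(-(0.5 * risk_score + 1.0 * 1.0)))).mean()

print(f"P(default | rate > 1)     ~ {observed:.2f}")    # confounded by risk_score
print(f"P(default | do(rate = 1)) ~ {intervened:.2f}")  # effect of the policy itself

The two numbers differ because the risk score confounds both the rate and the default outcome, which is exactly why interventional (and counterfactual) reasoning requires a causal model rather than a purely predictive one.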

Types of causal questions:

We start by broadening the set of causal questions we ask. Many teaching examples start with simple questions like “Does smoking cause cancer?” which is obviously quite useful to know but presumes we have already chosen a cause-effect pair to evaluate and have already measured them. There is a larger set of useful causal questions than “Does X cause Y?” and they come up all the time!

For either cause or effect, we may know the variable of interest in advance or not, yielding four possible types of causal questions.

  1. Testing: Experiments fall into this category: we know the cause (our product change) and the effects we want to look at (our metrics). These questions are often about estimating the strength of the relationship between the cause and effect so we can evaluate whether something is a success.
  2. Explanation: Often called “root-cause analysis,” a frequent analytics activity is to notice an effect in the data (e.g. a drop in an important metric) and then conduct a search for the cause.
  3. Characterization: In some cases, we know about the cause because some change was introduced (either by us or externally), and we would like to understand what the consequences are. It can be valuable to determine if anything unexpected has happened to uncover unforeseen costs or benefits.
  4. Discovery: The most open-ended causal questions pertain to whether cause-effect relationships exist that we have not considered, but that matter to our business. There may be things we are doing that we haven’t studied which have hidden consequences that are good or bad for our business in the long term.

Observe Causality using CausalImpact

  1. Let’s have a look at this with an example. Consider the following data on car drivers killed or seriously injured, and light goods drivers killed during the years 1969 to 1984 in Great Britain.
The number of car drivers killed or seriously injured during the years 1969 to 1984 in Great Britain.

Here, we will be using the CausalImpact library to estimate the causal effect of an intervention by specifying pre- and post-intervention periods and accounting for seasonality. The library employs Bayesian structural time series models to estimate the counterfactual outcome, allowing us to assess the impact of the intervention on the outcome variable.

import pandas as pd
from causalimpact import CausalImpact

# Pre-intervention period: January 1969 through January 1982
pre_period  = [pd.Timestamp('1969-01-01'), pd.Timestamp('1982-01-01')]
# Post-intervention period: February 1982 through December 1984
post_period = [pd.Timestamp('1982-02-01'), pd.Timestamp('1984-12-01')]

# Fit a Bayesian structural time series model with a 12-month seasonal component
ci = CausalImpact(xdat.loc[:, "car_ksi"], pre_period, post_period,
                  nseasons=[{'period': 12}], prior_level_sd=0.05)

print(ci.summary())

## OUTPUT ##
Posterior Inference {Causal Impact}
                          Average             Cumulative
Actual                    7.25                253.69
Prediction (s.d.)         7.35 (0.04)         257.37 (1.41)
95% CI                    [7.27, 7.43]        [254.52, 260.06]

Absolute effect (s.d.)    -0.1 (0.04)         -3.67 (1.41)
95% CI                    [-0.18, -0.02]      [-6.36, -0.82]

Relative effect (s.d.)    -1.43% (0.55%)      -1.43% (0.55%)
95% CI                    [-2.47%, -0.32%]    [-2.47%, -0.32%]

Posterior tail-area probability p: 0.0
Posterior prob. of a causal effect: 99.5%

Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 253.69. By contrast, had the intervention not taken place, we would have expected a sum of 257.37, with a 95% interval of [254.52, 260.06].

In absolute terms, the average effect is -0.1, with a 95% interval of [-0.18, -0.02]. In relative terms, the response variable showed a decrease of 1.43%, with a 95% interval of [-2.47%, -0.32%].

This means the negative effect observed during the intervention period is statistically significant. If the experimenter had expected a positive effect, it is recommended to double-check whether anomalies in the control variables may have caused an overly optimistic expectation of what should have happened in the response variable in the absence of the intervention.

The probability of obtaining this effect by chance is very small (Bayesian one-sided tail-area probability p = 0.0). This means the causal effect can be considered statistically significant.
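Beyond the textual summary, the fitted CausalImpact object can also be plotted to compare the observed series against the counterfactual prediction, along with the pointwise and cumulative effects (exact figure styling depends on which CausalImpact port is installed):

# Observed vs. counterfactual, pointwise effect, and cumulative effect
ci.plot()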

2. Another example of causality in the context of the COVID-19 vaccination campaign could be assessing whether the start of the vaccination campaign has caused a reduction in the number of COVID-19 cases, hospitalizations, and deaths.

Total number of COVID-19 cases per million

Here, we will determine the pre- and post-intervention periods based on when vaccinations started, and then use the CausalImpact library to estimate the causal effect of vaccinations on the total cases per million.

# When did vaccinations start? => determine period limits
treatment_start = xdat['people_fully_vaccinated_per_hundred'].dropna().index[0]
raw_end   = treatment_start - pd.Timedelta(days=1)  # last pre-intervention day
raw_start = min(xdat.index)                         # first observation

treatment_end = max(xdat.index)                     # last observation

pre_period  = [raw_start, raw_end]
post_period = [treatment_start, treatment_end]

ci = CausalImpact(xdat.loc[:, "total_cases_per_million"], pre_period, post_period,
                  nseasons=[{'period': 12}], prior_level_sd=0.05)

print(ci.summary())

## OUTPUT ##

Posterior Inference {Causal Impact}
                          Average                   Cumulative
Actual                    254428.93                 162325656.82
Prediction (s.d.)         55358.76 (10067.0)        35318886.3 (6422748.98)
95% CI                    [35976.48, 75438.42]      [22952997.33, 48129710.71]

Absolute effect (s.d.)    199070.17 (10067.0)       127006770.52 (6422748.98)
95% CI                    [178990.51, 218452.44]    [114195946.11, 139372659.49]

Relative effect (s.d.)    359.6% (18.19%)           359.6% (18.19%)
95% CI                    [323.33%, 394.61%]        [323.33%, 394.61%]

Posterior tail-area probability p: 0.0
Posterior prob. of a causal effect: 100.0%

Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 162325656.82. By contrast, had the intervention not taken place, we would have expected a sum of 35318886.3, with a 95% interval of [22952997.33, 48129710.71].

In absolute terms, the average effect is 199070.17, with a 95% interval of [178990.51, 218452.44]. In relative terms, the response variable showed an increase of +359.6%, with a 95% interval of [323.33%, 394.61%].

This means the positive effect observed during the intervention period is statistically significant and unlikely to be due to random fluctuations. It should be noted, however, that whether this increase also bears substantive significance can only be answered by comparing the absolute effect (199070.17) to the original goal of the underlying intervention.

The probability of obtaining this effect by chance is very small (Bayesian one-sided tail-area probability p = 0.0). This means the causal effect can be considered statistically significant.
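The verbal interpretation above mirrors the natural-language report the library can generate itself; in the Python ports this is typically exposed through the summary's report output (the argument name may vary slightly between versions):

# Full natural-language interpretation of the fitted model
print(ci.summary(output='report'))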

Observe Causality using DoWhy

DoWhy is an end-to-end library for causal analysis that builds on the latest research in modeling assumptions and robustness checks. Specifically, DoWhy is organized around the four key steps required for any causal analysis: Model, Identify, Estimate, and Refute. Model encodes prior knowledge as a formal causal graph; Identify uses graph-based methods to identify the causal effect; Estimate uses statistical methods to estimate the identified estimand; and finally, Refute tries to refute the obtained estimate by testing the robustness of the initial model's assumptions.

Here, we have a dataset of user spending habits over a 12-month period which includes information about each user’s signup month, their spending in each month, and whether they belong to the treatment group (i.e., whether they signed up).
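The causal_graph passed to the model below is a DOT string encoding the assumed relationships among these variables. The exact graph from the original notebook is not reproduced here, but a sketch consistent with the estimands printed later (signup_month confounding both treatment and post_spends, with Z and pre_spends serving as instruments) might look like this:

# Hypothetical DAG in DOT syntax (names are assumptions consistent with the
# estimand output below): signup_month is a confounder, while Z and
# pre_spends act as instruments for the treatment
causal_graph = """
digraph {
    Z -> treatment;
    pre_spends -> treatment;
    signup_month -> treatment;
    signup_month -> post_spends;
    treatment -> post_spends;
}
"""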

## Model a causal problem

from dowhy import CausalModel

# df_i_signupmonth: user-level spending data; causal_graph: DOT string of the assumed DAG
model = CausalModel(data=df_i_signupmonth,
                    graph=causal_graph.replace("\n", " "),
                    treatment="treatment",
                    outcome="post_spends")
model.view_model()  # render the causal graph

In DoWhy, a causal model is an object that encapsulates various aspects of the causal analysis process, such as the data, the treatment, the outcome, and the assumptions about the causal relationships among variables. One of the key components of the causal model in DoWhy is the causal graph, which is typically represented as a DAG.

A DAG in DoWhy is used to represent the assumed causal relationships among variables, visually displaying the structure of these relationships. The causal graph, as a DAG, is an essential input when creating a CausalModel object in DoWhy. It helps guide the identification, estimation, and refutation of causal effects.

## Identify a target estimand under the model
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)

##OUTPUT##

Estimand type: EstimandType.NONPARAMETRIC_ATE

### Estimand : 1
Estimand name: backdoor
Estimand expression:
d
────────────(E[post_spends|signup_month])
d[treatment]
Estimand assumption 1, Unconfoundedness: If U→{treatment} and U→post_spends then P(post_spends|treatment,signup_month,U) = P(post_spends|treatment,signup_month)

### Estimand : 2
Estimand name: iv
Estimand expression:
⎡ -1⎤
⎢ d ⎛ d ⎞ ⎥
E⎢────────────────(post_spends)⋅⎜────────────────([treatment])⎟ ⎥
⎣d[Z pre_spends] ⎝d[Z pre_spends] ⎠ ⎦
Estimand assumption 1, As-if-random: If U→→post_spends then ¬(U →→{Z,pre_spends})
Estimand assumption 2, Exclusion: If we remove {Z,pre_spends}→{treatment}, then ¬({Z,pre_spends}→post_spends)

### Estimand : 3
Estimand name: frontdoor
No such variable(s) found!

In this step, different causal estimands are presented for the causal effect of treatment on the outcome. Each estimand is based on a different identification strategy (backdoor, instrumental variable, or frontdoor) and relies on specific assumptions about the causal relationships among variables. These estimands represent the quantities we aim to estimate in causal analysis, given the assumed causal structure.

## Estimate causal effect based on the identified estimand
# Propensity score matching on the backdoor estimand; 'att' = effect on the treated
estimate = model.estimate_effect(identified_estimand,
                                 method_name='backdoor.propensity_score_matching',
                                 target_units='att')
print(estimate)

##OUTPUT##

*** Causal Estimate ***

## Identified estimand
Estimand type: EstimandType.NONPARAMETRIC_ATE

### Estimand : 1
Estimand name: backdoor
Estimand expression:
d
────────────(E[post_spends|signup_month])
d[treatment]
Estimand assumption 1, Unconfoundedness: If U→{treatment} and U→post_spends then P(post_spends|treatment,signup_month,U) = P(post_spends|treatment,signup_month)

## Realized estimand
b: post_spends~treatment+signup_month
Target units: att

## Estimate
Mean value: 86.26916725642847

The output shows the average treatment effect on the treated (ATT) is approximately 86.27. This means that, on average, the treatment group’s post_spends increased by about 86.27 units compared to what they would have spent in the absence of the treatment.
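As a quick sanity check, it can help to re-estimate the same backdoor estimand with a different method and compare the two numbers; here is a minimal sketch using DoWhy's linear-regression estimator (the resulting value is not from the original post):

# Re-estimate the same backdoor estimand with a simple linear model
estimate_lr = model.estimate_effect(identified_estimand,
                                    method_name='backdoor.linear_regression')
print(estimate_lr.value)  # compare against the propensity-score-matching ATT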

## Refute the obtained estimate
# Placebo test: permute the treatment; the estimated effect should drop to ~0
refutation = model.refute_estimate(identified_estimand, estimate,
                                   method_name='placebo_treatment_refuter',
                                   placebo_type='permute', num_simulations=20)
print(refutation)

## OUTPUT##

Refute: Use a Placebo Treatment
Estimated effect:86.26916725642847
New effect:-6.033203585751362
p value:0.22788907814083004
  • Estimated effect: 86.27 (original causal effect estimate)
  • New effect: -6.03 (causal effect estimate with the placebo treatment)
  • p-value: 0.23

The new effect under the placebo treatment is close to zero and far from the original estimated effect, which is a good sign. The p-value of 0.23 is greater than the standard significance level (e.g., 0.05), indicating that the placebo effect is not statistically distinguishable from zero; the placebo test therefore gives no evidence that the original estimate is an artifact of chance.

In summary, the refutation test with the placebo treatment provides evidence that the original causal effect estimate is robust and not due to chance alone.
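DoWhy also ships other refuters that stress the estimate from different angles. For example, adding a random common cause should leave a robust estimate essentially unchanged (a sketch; the output is not from the original post):

# Add an independent, randomly generated confounder; a robust estimate
# should barely move
refute_rcc = model.refute_estimate(identified_estimand, estimate,
                                   method_name='random_common_cause')
print(refute_rcc)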

Overall, causality is a fundamental concept that plays a critical role in scientific inquiry, policymaking, and decision-making. Establishing causal relationships allows us to draw more reliable conclusions, make better predictions, and design effective interventions to address pressing challenges in various domains.

References

  1. DoWhy: An End-to-End Library for Causal Inference
  2. Causal Inference and Its Fundamentals
  3. Introduction to Causal Inference
  4. Time Series Causal Impact Analysis in Python

License

All code in this notebook is available as open source through the MIT license.

All text and images are free to use under the Creative Commons Attribution 3.0 license. https://creativecommons.org/licenses/by/3.0/us/

These licenses let people distribute, remix, tweak, and build upon the work, even commercially, as long as they give credit for the original creation.

Copyright 2023 AI Skunks https://github.com/aiskunks

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
