Causality: Crash Course

AI Skunks
Apr 9, 2023

Causality refers to the relationship between an event (the cause) and a second event (the effect), where the second event is a result of the first. In short, causality is the idea that one event brings about another.

Causality is crucial in machine learning because it allows us to understand and make predictions about the effects of specific actions or interventions. Without understanding causality, we may be unable to distinguish between coincidence and true relationships between variables, which could lead to inaccurate predictions and unreliable decision-making.


Correlation vs. Causation

  • Correlation is a measure of the strength and direction of a linear relationship between two variables.
  • Causation, on the other hand, implies that one variable directly affects another variable.
  • Correlation does not necessarily imply causation; it is possible for two variables to be strongly correlated without one causing the other.
  • Causation can only be established through randomized experiments or carefully designed observational studies that account for confounding variables.
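To make the distinction concrete, here is a short simulation (a sketch with made-up numbers) in which a hidden third variable drives two others: the pair is strongly correlated even though neither causes the other, and the correlation vanishes once the confounder is removed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder Z (say, summer temperature) drives both variables
z = rng.normal(size=n)
ice_cream_sales = 2 * z + rng.normal(scale=0.5, size=n)
drownings = 3 * z + rng.normal(scale=0.5, size=n)

# Strongly correlated, even though neither variable causes the other
r = np.corrcoef(ice_cream_sales, drownings)[0, 1]
print(f"correlation: {r:.2f}")

# Removing Z's contribution (known exactly here, by construction)
# leaves two independent noise terms with correlation near zero
r_adj = np.corrcoef(ice_cream_sales - 2 * z, drownings - 3 * z)[0, 1]
print(f"after removing the confounder: {r_adj:.2f}")
```

Intervening on ice cream sales would plainly do nothing to drownings; only the confounder links them.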

Limitations of correlation analysis

  • Correlation analysis assumes a linear relationship between two variables and may not capture non-linear or complex relationships.
  • Correlation analysis does not account for confounding variables that may affect both the dependent and independent variables, leading to spurious correlations.
  • Correlation analysis may be influenced by outliers or extreme values, leading to erroneous conclusions.
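The first limitation is easy to demonstrate: even a perfect but non-linear dependence can produce a Pearson correlation near zero. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# y is a deterministic function of x, yet the linear (Pearson)
# correlation is near zero because the relationship is symmetric
x = rng.uniform(-3, 3, size=10_000)
y = x**2
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson correlation: {r:.3f}")
```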

Need for Causal Inference

  • Causal inference is necessary when we want to establish a cause-and-effect relationship between two variables.
  • Machine learning models can identify associations and predict outcomes, but they cannot establish causality without carefully designed studies or experiments.
  • Causal inference is important in fields such as healthcare, economics, and policy-making, where interventions and decisions have real-world consequences.
  • Techniques for causal inference include randomized controlled trials, natural experiments, and observational studies with statistical adjustments for confounding variables.

Causal Inference

Types of causal inference

  1. Predictive inference aims to make predictions based on patterns in data but does not attempt to establish causal relationships.
  2. Structural inference involves modeling the underlying mechanisms that generate the data and can be used to make causal inferences.
  3. Counterfactual inference involves comparing outcomes under different interventions or treatments to estimate causal effects.

Potential outcomes framework

  1. The potential outcomes framework is a mathematical framework for defining causal effects in terms of potential outcomes.
  2. It assumes that each individual has a potential outcome under each possible treatment, and the causal effect of a treatment is the difference between the potential outcomes.
  3. The framework allows causal effects to be estimated even though only one potential outcome — the one under the treatment actually received — is observed for each individual (the fundamental problem of causal inference).


Counterfactuals

  1. Counterfactuals refer to the outcomes that would have been observed if a different treatment had been administered.
  2. Counterfactuals cannot be directly observed but can be estimated using causal inference techniques.
  3. Counterfactuals can be used to estimate the causal effects of different interventions or treatments.

Causal diagrams

  1. Causal diagrams, also known as directed acyclic graphs (DAGs), are a graphical representation of causal relationships between variables.
  2. They can be used to identify confounding variables, selection bias, and other sources of bias in observational studies.
  3. Causal diagrams can be used to guide the selection of variables to control for in the statistical analysis.
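A DAG can be sketched in plain Python as a mapping from each variable to its children. The hypothetical three-node graph below, where Z confounds the treatment-to-outcome edge, includes a naive check for common parents of treatment and outcome:

```python
# A causal DAG as a mapping from each variable to its children:
# Z is a common cause of treatment T and outcome Y (a confounder)
dag = {
    "Z": ["T", "Y"],
    "T": ["Y"],
    "Y": [],
}

def parents(node):
    """Variables with a directed edge into `node`."""
    return {p for p, children in dag.items() if node in children}

# Naive confounder check: common parents of treatment and outcome
confounders = parents("T") & parents("Y")
print(confounders)
```

Any variable this check flags would need to be controlled for when estimating the effect of T on Y (a full treatment would use d-separation rather than direct parents only).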

Estimating causal effects

  1. Randomized controlled trials are the gold standard for estimating causal effects, but may not always be feasible or ethical.
  2. Observational studies can be used to estimate causal effects using techniques such as propensity score matching, instrumental variables, and regression discontinuity designs.
  3. Machine learning algorithms can be used to estimate causal effects by incorporating causal assumptions into the model.
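As a sketch of point 2, the snippet below simulates confounded observational data (all coefficients chosen for illustration) and compares a naive difference of group means with an inverse propensity weighting (IPW) estimate, using scikit-learn's LogisticRegression as the propensity model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 20_000

# Confounded observational data: x raises both the chance of
# treatment and the outcome; the true treatment effect is 2.0
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = 2.0 * t + 1.5 * x + rng.normal(scale=0.5, size=n)

# Naive difference of group means is biased upward by x
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse propensity weighting: model P(T=1 | x), then reweight
ps = LogisticRegression().fit(x.reshape(-1, 1), t).predict_proba(x.reshape(-1, 1))[:, 1]
ipw = np.mean(t * y / ps) - np.mean((1 - t) * y / (1 - ps))
print(f"naive: {naive:.2f}, IPW: {ipw:.2f} (true effect: 2.0)")
```

The naive estimate overshoots the true effect because treated units tend to have higher x; reweighting by the estimated propensity scores recovers something close to 2.0.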

Example equation for the average treatment effect:

ATE = E[Y(1) - Y(0)]

where ATE is the average treatment effect, Y(1) is the potential outcome under treatment, Y(0) is the potential outcome under no treatment, and E is the expected value.
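A quick simulation (with an assumed constant effect of 1.5) shows how randomization lets the simple difference of group means recover the ATE, even though only one potential outcome per unit is ever observed:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# Both potential outcomes, known only because this is a simulation:
# treatment adds 1.5 to every unit's outcome, so the true ATE is 1.5
y0 = rng.normal(loc=10.0, scale=2.0, size=n)
y1 = y0 + 1.5

# In practice each unit reveals only one potential outcome.
# Random assignment makes the two groups comparable, so the
# difference of observed group means estimates E[Y(1) - Y(0)].
t = rng.binomial(1, 0.5, size=n)
y_obs = np.where(t == 1, y1, y0)
est_ate = y_obs[t == 1].mean() - y_obs[t == 0].mean()
print(f"estimated ATE: {est_ate:.2f} (true ATE: 1.5)")
```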

Causal Machine Learning

Causal models in machine learning

  1. Causal models in machine learning involve incorporating causal assumptions into the model.
  2. These models can be used to estimate causal effects and make predictions under different interventions or treatments.
  3. Causal models include Bayesian networks, structural equation models, and potential outcomes models.

Interventions and counterfactuals in ML

  1. Interventions in machine learning refer to changing the input to the model to observe how the output changes.
  2. Counterfactuals in machine learning refer to predicting what the output would have been if the input had been different.
  3. Interventions and counterfactuals can be used to estimate causal effects in machine learning models.

Challenges in causal machine learning

  1. The curse of dimensionality can make it difficult to estimate causal effects in high-dimensional data.
  2. The presence of hidden confounders can lead to biased estimates of causal effects.
  3. The lack of a clear causal model can make it difficult to interpret the results of causal machine-learning models.

Example code in Python for computing counterfactual predictions using a linear regression model:

import numpy as np
from sklearn.linear_model import LinearRegression

# Generate some data
X = np.random.normal(loc=0, scale=1, size=(100, 2))
y = 2*X[:, 0] + 3*X[:, 1] + np.random.normal(loc=0, scale=0.1, size=100)

# Fit a linear regression model
model = LinearRegression().fit(X, y)

# Predict counterfactuals
X_cf = np.array([[1, 2], [3, 4]])
y_cf = model.predict(X_cf)
print(f"Counterfactual predictions: {y_cf}")

Example equation for estimating the conditional average treatment effect using a causal model:

CATE(x) = E[Y(1) - Y(0) | X = x] = E[Y | do(T = 1), X = x] - E[Y | do(T = 0), X = x]

where CATE(x) is the average treatment effect for individuals with covariates X = x, Y(1) and Y(0) are the potential outcomes under treatment and no treatment, T is the treatment variable, do(T = t) denotes an intervention that sets T to t, and E is the expected value. Averaging CATE(x) over the distribution of X recovers the overall ATE.
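One way such do-expressions can be operationalized, assuming the covariates block all backdoor paths, is regression adjustment: fit E[Y | T, X], then average the model's predictions with T forced to 1 and to 0. A sketch on simulated data (true effect 2.0 by construction):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n = 20_000

# Simulated data where x confounds t and y; true effect of t is 2.0
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = 2.0 * t + 1.5 * x + rng.normal(scale=0.5, size=n)

# Regression adjustment: fit E[Y | T, X], then average predictions
# with T set to 1 and to 0 for every unit (mimicking do(T = t))
model = LinearRegression().fit(np.column_stack([t, x]), y)
y_do1 = model.predict(np.column_stack([np.ones(n), x])).mean()
y_do0 = model.predict(np.column_stack([np.zeros(n), x])).mean()
print(f"adjusted ATE: {y_do1 - y_do0:.2f} (true effect: 2.0)")
```

This only works because the adjustment set is adequate and the outcome model is well specified; with a hidden confounder the estimate would remain biased.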

Application of Causal ML

  1. Healthcare: Causal ML for estimating treatment effects, personalized treatment plans, readmission prediction, identifying disease risk factors, and improving patient outcomes.
  2. Marketing and Advertising: Causal ML for measuring marketing effectiveness, optimizing advertising strategies, identifying drivers of customer behavior, and developing targeted marketing interventions.
  3. Public Policy: Causal ML for policy evaluation, identifying effective policies, and optimizing resource allocation.
  4. Social Sciences: Causal ML for predicting social behavior, identifying causal factors of crime, and analyzing the impact of education on social mobility and voting behavior.
  5. Finance and Economics: Causal ML for estimating causal relationships between economic variables, predicting economic outcomes, developing investment strategies, and optimizing risk management.


Importance of causality in ML

  • Causality helps us understand the underlying mechanisms and drivers of the data, which is crucial for decision-making and intervention.
  • Correlation does not imply causation, and relying solely on correlation can lead to incorrect conclusions and ineffective interventions.
  • Causal inference techniques can help us estimate the causal effects of interventions and make more informed decisions.

Future Directions for causal machine learning research

  • Developing more robust and scalable methods for causal inference in high-dimensional and complex data.
  • Integrating causal inference with other ML techniques such as reinforcement learning and deep learning.
  • Extending causal inference to incorporate time-varying treatments and outcomes.
  • Incorporating domain knowledge and human expertise into causal inference models.

