What is Causal Machine Learning?

A Gentle Guide to Causal Inference with Machine Learning Pt. 2

Kenneth Styppa
6 min read · Feb 27, 2023


Causal Machine Learning seems to be the trendiest buzzword in Data Science at the moment. But what is it really? In this blog series, we give a gentle introduction to causal inference for newcomers.

Causal Machine Learning — The Name of the Game

In the last few decades, researchers have developed more and more sophisticated artificial intelligence (AI) techniques, leading to increasingly powerful algorithms and spectacular events such as the victory of AlphaGo. Utilizing swaths of high-dimensional and heterogeneous data sources, more and more high-performance applications entered the market, and data science was proclaimed the sexiest job of the 21st century. Everybody has heard something about Deep Learning and Artificial Intelligence, Deep AI, Strong AI, and so on: a heaven for buzzword bingo lovers. AI seemed poised to soon become superhuman.
Yet one flaw remained: all these powerful systems lack interpretability. And what's worse, the more powerful they become, the less interpretable they tend to be.

Simultaneously, academics warned against using the trained models in critical real-life scenarios, because the models are only as reliable as the data on which they have been trained. They pointed out that modern machine learning is a very powerful correlation-pattern recognition system that works well on data from the same distribution it was trained on, but is incredibly vulnerable to distribution shifts.

The solution to this problem seemed terribly easy: simply replacing correlation-based recognition systems with causality-based machine learning should seal the deal, as some companies advertise. This simple but very attractive idea of combining highly powerful machine learning methods with solid causal knowledge led to the hype around the buzzword “Causal Machine Learning”, which sounds as if it would magically solve the last remaining problem of AI.

While this might sound amazing, it is important to maintain some scientific skepticism.
One could argue that the stage we are currently in resembles the typical inflated expectations that accompany the introduction of a promising new technology. It is undoubtedly true that the connection between causal inference and machine learning has already led, and will continue to lead, to powerful methods that catalyze important developments in ML. But the actual potential of the technology is not yet fully understood.

While we do not want to underestimate the potential and societal importance of causal machine learning, we still want to stick to the academic reality. That reality is that research on causal machine learning, while opening up possibilities for causality-based and thus more resilient models, is still in its infancy. At present, causal inference methods have mostly been developed for low-dimensional and relatively simply structured data, and are largely focused on either identifying causal relationships (causal discovery) or assuming them in a causal model and quantifying their effects (causal effect estimation). There are further exciting developments that we may cover in later posts. For now, and without claiming the status of a definition, we can call these two aspects the two pillars of current causal learning.

What is Causal Inference?

In the most basic terms, causal inference is the discipline that formalizes the pursuit of identifying, modeling, and quantifying causal relationships. To some, this might sound terribly easy, as humans are very good at intuitively separating a cause from its effect. For example, after suffering severe pain when coming close to boiling water, you have an intuitive understanding of the cause and the effect. From a statistical perspective, however, distinguishing cause and effect is not that straightforward. The world is full of associations between variables, but not all of them are causal. Given this, one could say that the goal of causal inference is to identify and quantify the true causal nature of a world full of associations.

The Connection between Association and Causation

When we speak of an association between two variables in our data set, we mean that the variables are statistically dependent; this may be a positive or negative correlation, or a more general nonlinear dependency. Assuming a world without coincidental dependencies, every association can only be explained in two ways: either one of the variables causes the other, or both are caused by a third, possibly unobserved, common cause. This is called Reichenbach’s Common Cause Principle (Hitchcock & Rédei, 2020).

The famous example of ice cream and sunburns makes this clear. Although sunburns and ice cream consumption are highly correlated, neither causes the other. The variable behind this association is the hot sun during the summer months, which causes both sunburns and ice cream consumption to rise.
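To make this concrete, here is a minimal simulation sketch of that common-cause structure (the variable names and coefficients are purely illustrative, not taken from any real data set): a single variable drives both ice cream consumption and sunburns, which are never directly linked, yet the two end up clearly correlated. Once the common cause is accounted for, the association disappears.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 10_000

sunshine = rng.normal(size=n)                      # common cause
ice_cream = 0.8 * sunshine + rng.normal(size=n)    # sunshine -> ice cream consumption
sunburn = 0.7 * sunshine + rng.normal(size=n)      # sunshine -> sunburns

# Marginally, ice cream and sunburn are clearly correlated ...
print(np.corrcoef(ice_cream, sunburn)[0, 1])       # roughly 0.36

# ... but the association vanishes once the common cause is regressed out:
ice_resid = ice_cream - np.polyval(np.polyfit(sunshine, ice_cream, 1), sunshine)
burn_resid = sunburn - np.polyval(np.polyfit(sunshine, sunburn, 1), sunshine)
print(np.corrcoef(ice_resid, burn_resid)[0, 1])    # close to 0
```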

Towards Causal Machine Learning

While it is very easy to distinguish between cause and effect in the ice cream example, disentangling causal association from mere statistical association becomes a non-trivial task when confronted with an observational data set full of correlations between various variables about whose relationships you have almost no knowledge. In exactly these settings, the methods of Causal Inference will be your tools to solve this big puzzle of associations. Causal Machine Learning then combines this with the power of machine learning and deep learning to deal with high-dimensional and complex data. Given this understanding, Causal Machine Learning has three value propositions:

  1. With the tools and assumptions of Causal Discovery, you will be able to learn causal graphs that represent the causal relations of the underlying system that generated the data you observe.
  2. With the tools of Causal Effect Estimation, you will be able to quantify and estimate causal effects based on qualitative knowledge of the causal graph, using causal inference frameworks such as the do-calculus (a minimal sketch of this idea follows below this list).
  3. Causal Machine Learning then brings these together with deep neural networks and the like to provide models fit for the complexities of a causal world. Here are two examples of this endeavor by the pioneers Judea Pearl and Bernhard Schölkopf.
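To give a flavor of the second pillar, here is a minimal sketch of causal effect estimation via backdoor adjustment on simulated data. We assume the causal graph is already known (a confounder z causes both the treatment x and the outcome y, and x causes y); the variable names, coefficients, and the linear model are illustrative assumptions, not a general recipe.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000

z = rng.normal(size=n)                        # observed confounder
x = 0.9 * z + rng.normal(size=n)              # treatment, partly driven by z
y = 0.5 * x + 1.2 * z + rng.normal(size=n)    # outcome; the true effect of x on y is 0.5

# Naive estimate: regressing y on x alone is biased by the backdoor path x <- z -> y
naive = np.linalg.lstsq(np.column_stack([x, np.ones(n)]), y, rcond=None)[0][0]

# Backdoor adjustment: also conditioning on z recovers the true effect
adjusted = np.linalg.lstsq(np.column_stack([x, z, np.ones(n)]), y, rcond=None)[0][0]

print(f"naive estimate:    {naive:.2f}")      # around 1.1, far from the truth
print(f"adjusted estimate: {adjusted:.2f}")   # close to the true 0.5
```

The naive regression picks up the spurious backdoor path through z and overestimates the effect, while adjusting for the known confounder recovers the true value. In real applications, this toy linear adjustment is replaced by the machine-learning-based estimators and high-dimensional settings that Causal Machine Learning is about.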

To understand how all of these tools work and how you can apply them in your own data science projects, make sure to read the rest of this series. Besides explaining the fundamentals of causal inference to give you some orientation, we will also discuss emerging methods and trends, serving as a guide towards Causal Machine Learning. Hope you stick around.
Thanks for reading!

For more details on Reichenbach’s Common Cause Principle have a look at:

Hitchcock, C., & Rédei, M. (2020). Reichenbach’s Common Cause Principle. The Stanford Encyclopedia of Philosophy.

and

Neal, B. (2020). Introduction to Causal Inference: From a Machine Learning Perspective.

About the authors:

Kenneth Styppa is part of the Causal Inference group at the German Aerospace Center’s Institute of Data Science. He has a background in Information Systems and Entrepreneurship from UC Berkeley and Zeppelin University, where he has engaged in both startup and research projects related to Machine Learning. Besides working together with Jakob, Kenneth worked as a data scientist at BMW and currently pursues his graduate degree in Applied Mathematics and Computer Science at Heidelberg University. More on: https://www.linkedin.com/in/kenneth-styppa-546779159/

Jakob Runge heads the Causal Inference group at the German Aerospace Center’s Institute of Data Science in Jena and is chair of computer science at TU Berlin. The Causal Inference group develops causal inference theory, methods, and accessible tools for applications in Earth system sciences and many other domains. Jakob holds a physics PhD from Humboldt University Berlin and started his journey in causal inference at the Potsdam Institute for Climate Impact Research. The group’s methods are distributed open-source on https://github.com/jakobrunge/tigramite.git. More about the group on www.climateinformaticslab.com
