Causal Graph Inference

adam kelleher
4 min read · Feb 9, 2017


I’m back! I got a little busy, and ended up taking a short hiatus from blogging. I started teaching causal inference at Columbia during that time, and will start publishing my course materials soon! Let’s pick up the series where we left off: causal graph inference!

The world is a complicated place. It’s hard to know the unintended consequences of our actions. The problem, really, is that causes have effects, which themselves are the causes of other effects, and so on. It’s very hard to enumerate all the downstream effects of an action. With such an interconnected world, how can you keep it all straight? You might do an experiment which has a positive effect on a variable you’re interested in, but a negative effect in some unforeseen area! How can you have a reasonable expectation that you’ve accounted for all of the effects that you care about?

I’ve been writing this blog series about causality from the perspective of the Pearlian causal framework. The Pearlian framework uses graphs as the language of causality. It’s intuitive and pictorial, and lets you talk about causal pathways from one variable to another: if you can put together a chain of cause and effect going from X to Y, then X might have a causal effect on Y. In that framework, it’s easy to enumerate the consequences of actions. But how do you find the graph?

Finding the graph is the crux of the problem. There are a number of tools for it, and this post is going to focus on one of the more interesting ones. As I develop this series, I’m going to try to bring some of this together into something that’s practically useful for data scientists. I haven’t yet finished my research into a broadly applicable set of guidelines for causal graph inference (comment if you have some suggestions!), so for now I’m going to focus on outlining some of the tools that I’ve found useful in practice.

I want to introduce a (relatively old) tool that gives you a partial causal graph from purely observational data! Astoundingly, there’s no experiment required, but there are some assumptions you have to make. I’ll eventually outline these as completely as I can, but for now will focus on a demonstration, and building some intuition for why it works. This algorithm is designed to work in the context of latent variables: that is, you don’t need to measure all of the variables that are operating in the system in order to infer causal effects. You don’t even need to measure all of the confounders!

First, since this is such a wild idea to a lot of people (that there’s an observational test for causation), I’m going to do a quick example. I’ve implemented the IC* algorithm in my causality package, so I’ll run through an example that just looks like magic.

An Example

In this example, we’ll run the IC* algorithm on two graphs. They’ll have the same skeletons (the set of edges you get when you take the arrows off), but with one key difference: in one graph, an edge will be due to an unobserved confounding variable. In the other, it will be a genuine causal relationship. Here are the graphs we’ll be working with:

Our two test graphs. On the left, there’s a genuine causal relationship between X4 and X5. On the right, the two are statistically dependent, but the dependence is due entirely to the confounding variable, X6. When we do graph inference, we’re going to omit X6 from our data set!

We’ll generate a data set from each of these graphs: the first for the graph on the left, and the second for the graph on the right.
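I’m not reproducing the exact generating code here; the snippet below is a minimal stand-in whose structure is consistent with the description above. The particular structural equations, coefficients, and variable layout are illustrative assumptions, not the exact data-generating process from the original example.

```python
import numpy as np
import pandas as pd

SIZE = 2000

# Data set 1: the X4 -> X5 edge is a genuine causal effect.
x1 = np.random.normal(size=SIZE)
x2 = x1 + np.random.normal(size=SIZE)
x3 = x1 + np.random.normal(size=SIZE)
x4 = x2 + x3 + np.random.normal(size=SIZE)
x5 = x4 + np.random.normal(size=SIZE)
X_causal = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3, 'x4': x4, 'x5': x5})

# Data set 2: same skeleton, but the X4 -- X5 dependence is due entirely to a
# latent confounder X6, which we deliberately leave out of the data frame.
x1 = np.random.normal(size=SIZE)
x2 = x1 + np.random.normal(size=SIZE)
x3 = x1 + np.random.normal(size=SIZE)
x6 = np.random.normal(size=SIZE)  # the unobserved confounder
x4 = x2 + x3 + x6 + np.random.normal(size=SIZE)
x5 = x6 + np.random.normal(size=SIZE)
X_confounded = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3, 'x4': x4, 'x5': x5})
```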

The first output is the edgelist for the first data set.
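Here’s a sketch of how you might run the search and print that edgelist, assuming the interface documented in the causality package’s README (the IC class, the RobustRegressionTest conditional independence test, and its search method), and reusing the data frames from the snippet above:

```python
from causality.inference.search import IC
from causality.inference.independence_tests import RobustRegressionTest

# All five observed variables are continuous ('c').
variable_types = {'x1': 'c', 'x2': 'c', 'x3': 'c', 'x4': 'c', 'x5': 'c'}

# Run the IC* search on the first data set, where X4 -> X5 is genuinely causal.
ic = IC(RobustRegressionTest)
graph = ic.search(X_causal, variable_types)

# Each recovered edge carries an 'arrows' list (inferred orientations) and a
# 'marked' flag; 'marked' is True when the algorithm asserts the edge is
# genuinely causal.
print(graph.edges(data=True))
```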

We’ll need to talk a little about the output format to understand what this all means, but the basic picture is that the algorithm recovered much of the graph.

In particular, it believes that there’s an arrow from X4 to X5 which is genuinely causal; the others it isn’t really sure about. This is pretty good! We’ve inferred a causal relationship from observational data! What we really need, though, is for the algorithm not to infer a causal edge in the same place from the second data set.

Running the search on that data set returns a different edgelist.
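Concretely, re-running the same search on the confounded data frame and singling out the X4-X5 edge looks something like this (again a sketch, reusing names from the earlier snippets):

```python
# The same search, this time on the data set where the X4 -- X5 dependence
# comes from the hidden confounder X6.
graph2 = ic.search(X_confounded, variable_types)
print(graph2.edges(data=True))

# Single out the X4 -- X5 edge: it should still be present (the dependence is
# real), but its 'marked' flag should now be False, since the algorithm can no
# longer assert a genuinely causal relationship.
for a, b, attrs in graph2.edges(data=True):
    if {a, b} == {'x4', 'x5'}:
        print(a, b, attrs)
```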

The X4-X5 edge is no longer marked, and its inferred direction is also reversed. The key here is that the algorithm no longer thinks there’s a genuine causal relationship: it has found that X4 and X5 are still dependent in a way that can’t be explained away by the other observed variables, but it can no longer establish genuine causation.

This is pretty good, considering that there’s a latent confounding variable (X6) between X4 and X5! My next post will go into much more detail about how this works, and we’ll go into depth on the statistical criterion for establishing causality. In the meantime, you could read chapter 2 of Pearl’s Causality, or check out the paper here.

I should also mention there are many algorithms and a rich literature on causal graph inference. For implementations of many more, check out the TETRAD package from CMU!

