adam kelleher
2 min read · Sep 10, 2016


Of course! In each case, we’ll estimate the effect of X on Y.

First, for the chain graph X → A → Y, your set Z is empty. This is because there are no paths between Y and X with an arrow pointing into X. The set of such paths is just the empty set, and so the empty set of variables blocks the empty set of back door paths.

Next, we’ll look at the fork X ← A → Y. Now there is a path between X and Y (ignoring the direction of the arrows) with an arrow that points into X. We need to block this path, and conditioning on A does the trick. Our set of variables to control for is just Z = {A}, the set containing A.

For the collider, X → A ← Y, there are again no back-door paths into X. Conditioning on the empty set, Z = {}, is the right thing to do. Note also that if we wanted to condition on A (sometimes, while not being minimal, conditioning on extra things doesn’t hurt), we can’t! A is a descendant of X, so it violates condition (i) of the back-door criterion. Conditioning on Z = {A} would introduce bias in this case.
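If it helps to see the three cases mechanically, here is a minimal Python sketch. It assumes the usual statement of the back-door criterion: (i) no node in Z is a descendant of X, and (ii) Z blocks every path between X and Y that has an arrow into X. The function names and the edge-set representation are my own choices for illustration, not from any particular library.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows forward."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(edges, start, end):
    """All simple paths from start to end, ignoring arrow direction."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            paths.append(path)
            continue
        for nxt in neighbors.get(path[-1], ()):
            if nxt not in path:
                stack.append(path + [nxt])
    return paths


def is_blocked(edges, path, z):
    """True if some 3-node sequence along the path is blocked given z."""
    for left, mid, right in zip(path, path[1:], path[2:]):
        collider = (left, mid) in edges and (right, mid) in edges
        if collider:
            # A collider blocks unless it, or one of its descendants, is in z.
            if mid not in z and not (descendants(edges, mid) & z):
                return True
        elif mid in z:
            # A chain or fork blocks when its middle node is in z.
            return True
    return False


def satisfies_back_door(edges, x, y, z):
    z = set(z)
    # Condition (i): no node in z may be a descendant of x.
    if z & descendants(edges, x):
        return False
    # Condition (ii): z blocks every path that starts with an arrow into x.
    back_door = [p for p in simple_paths(edges, x, y) if (p[1], x) in edges]
    return all(is_blocked(edges, p, z) for p in back_door)


# Each graph is a set of (parent, child) edges.
chain = {("X", "A"), ("A", "Y")}
fork = {("A", "X"), ("A", "Y")}
collider = {("X", "A"), ("Y", "A")}

print(satisfies_back_door(chain, "X", "Y", set()))     # True
print(satisfies_back_door(fork, "X", "Y", set()))      # False
print(satisfies_back_door(fork, "X", "Y", {"A"}))      # True
print(satisfies_back_door(collider, "X", "Y", set()))  # True
print(satisfies_back_door(collider, "X", "Y", {"A"}))  # False
```

The path enumeration is brute force, which is fine for toy graphs like these but would not scale to large ones.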

Things get a little more interesting when you work with bigger graphs. In that case the more rigorous definition of blocking is useful, so I’ll give it here. I avoided it in the article for obvious reasons: it’s pretty complicated.

You can have paths that are longer than these 3-node sequences. Such a path is blocked if it contains one of these sequences with the middle node in Z, as in criterion 1. I hope this helps!
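To make the longer-path case concrete, here is a small Python sketch. It assumes the standard d-separation rule for blocking: a path is blocked by Z if it contains a chain or fork whose middle node is in Z, or a collider whose middle node is not in Z and has no descendant in Z. The five-node graph and the function names are hypothetical, made up just for this illustration.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows forward."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_blocked(edges, path, z):
    """Scan every consecutive 3-node sequence on the path."""
    for left, mid, right in zip(path, path[1:], path[2:]):
        collider = (left, mid) in edges and (right, mid) in edges
        if collider:
            # Collider: blocks unless mid or one of its descendants is in z.
            if mid not in z and not (descendants(edges, mid) & z):
                return True
        elif mid in z:
            # Chain or fork: blocks when the middle node is in z.
            return True
    return False


# Hypothetical graph: X <- B -> C <- D -> Y, as (parent, child) edges.
edges = {("B", "X"), ("B", "C"), ("D", "C"), ("D", "Y")}
path = ["X", "B", "C", "D", "Y"]

print(is_blocked(edges, path, set()))       # True: the collider at C blocks
print(is_blocked(edges, path, {"C"}))       # False: conditioning on C opens it
print(is_blocked(edges, path, {"C", "B"}))  # True: the fork at B blocks again
```

One blocked 3-node sequence anywhere along the path is enough to block the whole path, which is why the function returns as soon as it finds one.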

adam kelleher

Physicist; formerly Data @ BuzzFeed; Adjunct Prof. at Columbia;