Subtle Variations in Selection Effects and Conditioning

Henry Kim
Aug 28, 2017 · 4 min read

I’m a bit stunned that I had not come across this article, penned by a sociologist in The Atlantic, before now, since it comes very close to how I tend to approach statistical topics. (h/t to Siobhan McAndrew)

The intuition behind instrumental variables that they don’t “really” teach you in an intro econometrics class is that an instrument simply introduces a “conditioning factor” that refracts the distributions for the subset of the data that is affected by the instrument. (They sort of do teach it, but not in a manner that makes intuitive sense. Speaking from experience, that is partly because those doing the teaching sometimes don’t have a good intuition themselves. I know I didn’t until several years had passed.)

A classic application of this sort of thinking shows up in a paper about the effect of radio on a New Deal program (a neat paper, but I cannot remember the details for the life of me, alas!). The argument is that the introduction of radio caused (or at least strengthened) a linkage between the policy and the electoral results. While the easy correlation to look for would be of the “more radio → more electoral linkage” kind, the rate at which radio was being introduced, at least among the general population in Appalachia (I think), may have been correlated with socio-economic conditions that themselves lead to greater policy-electoral linkage. I imagine that those might be things like education and wealth (potentially capturable through other variables, perhaps) or greater interest in politics or current events (much harder to capture).

The author used a clever instrument: a particular geological condition that affects radio waves in the American Southeast. The presence of this condition makes radio useless, so you won’t get a radio even if you are a politics junkie. Where the condition is not present, political junkies get radios more than non-junkies do. Where the condition is present, even political junkies don’t get radios. So the result is that, where the geological condition is present, the policy-election linkage is much weaker, with all other variables held equal. Perhaps it is not a perfect instrument: maybe political junkies move out of the areas where radio does not work well and buy radios where they can listen to them, but that is a lot harder than just not listening. Basically, the effect uncovered is this: where the intervening variable does not show up, for reasons not related (much) to “politics” writ large, the correlation between the independent and dependent variables is weak. Where the intervening variable does show up, the correlation is much stronger. So the intervening variable, indirectly, “caused” the effect, i.e.:
favorable geological condition → more radio → greater policy-election linkage.
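To make that logic concrete, here is a minimal toy simulation in Python. It is only a sketch and not the paper's actual design: the variable names, the coefficients, and the assumption that radio ownership is all-or-nothing are all invented for illustration. Unobserved political interest drives both radio ownership and the outcome, so the naive slope of linkage on radio is inflated, while the comparison across geology (the Wald form of the instrumental variables estimator, used here because the instrument is binary) should land near the effect that was built into the fake data.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope from regressing y on x (covariance over variance)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def simulate(n=40000, true_effect=1.0, seed=0):
    """Toy data; every distribution and coefficient here is an assumption."""
    rng = random.Random(seed)
    geology, radio, linkage = [], [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0  # 1 = geology lets radio signals through
        u = rng.gauss(0, 1)                 # unobserved interest in politics
        # Radio is only worth owning where signals arrive; junkies buy more often.
        r = 1 if (z == 1 and u + rng.gauss(0, 1) > 0) else 0
        # The outcome depends on radio AND on the unobserved interest (confounder).
        y = true_effect * r + u + rng.gauss(0, 1)
        geology.append(z)
        radio.append(r)
        linkage.append(y)
    return geology, radio, linkage


def wald_estimate(z, x, y):
    """IV estimate with a binary instrument: ratio of group-mean differences."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    x1 = mean([xi for zi, xi in zip(z, x) if zi == 1])
    x0 = mean([xi for zi, xi in zip(z, x) if zi == 0])
    return (y1 - y0) / (x1 - x0)


geology, radio, linkage = simulate()
naive = ols_slope(radio, linkage)
iv = wald_estimate(geology, radio, linkage)
print(f"naive slope: {naive:.2f}, IV (Wald) estimate: {iv:.2f}, true effect: 1.00")
```

The instrument works in this sketch only because geology is drawn independently of political interest. If junkies could relocate to good-reception areas, as worried about above, that independence would fail and the Wald ratio would be contaminated too.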

In this case, the causal story is quite plausible: where geology prevents radio, there is limited correlation, holding politico-social conditions constant (this is a big if: when the conditions are unobserved, they are being held constant only by faith); where geology permits radio, there is a strong correlation. So radio → effect. Presumably, geology does not select on “politics.” But a lot of the things we observe do. People who vote do so because they “know” politics better and buy into the conventions of politics. Students who sign up for political science classes do so because they are interested in “politics.” As per the example in the article, people who support a political party do so for some (but not necessarily all) of the things that the party is associated with. To continue with the article’s example, in the general population, those who favor legalizing marijuana also tend to support economic redistribution. Among Republicans, this linkage goes away. The reason is simple: people who like one but not the other still have half a reason to fit into the Republican Party, while people who like both have no reason to support the GOP. This is more than just a matter of dissecting the data; it is an insight into how social processes work. Everything operates by introducing some filter, a selection bias (machine learning simply formalizes it and, in a way, makes it “fairer” and cruder, by taking away a lot of squishy nuances that are understated). The filter reveals, or conceals, relationships that exist in different parts of the (whole) data.
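The pot-and-redistribution filter can be sketched the same way. The Python below is a toy illustration under loudly stated assumptions: a single made-up “leaning” drives both attitudes, and the chances of identifying as Republican (90% if you like neither position, 50% if you like one, 5% if you like both) are invented numbers, not estimates from the article or from any survey. The point is only that conditioning on party membership can shrink a correlation that is clearly positive in the whole population.

```python
import random


def correlation(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical chance of being a Republican, keyed by number of positions liked.
JOIN_PROBABILITY = {0: 0.90, 1: 0.50, 2: 0.05}


def simulate_population(n=40000, seed=0):
    """Each person is (likes_pot, likes_redistribution, is_republican)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        leaning = rng.gauss(0, 1)  # shared latent leaning links the two attitudes
        likes_pot = 1 if leaning + rng.gauss(0, 1) > 0 else 0
        likes_redistribution = 1 if leaning + rng.gauss(0, 1) > 0 else 0
        liked = likes_pot + likes_redistribution
        # The filter: liking both positions leaves little reason to join.
        is_republican = rng.random() < JOIN_PROBABILITY[liked]
        people.append((likes_pot, likes_redistribution, is_republican))
    return people


def pot_redistribution_correlation(people):
    return correlation([p[0] for p in people], [p[1] for p in people])


population = simulate_population()
republicans = [p for p in population if p[2]]
overall = pot_redistribution_correlation(population)
within_party = pot_redistribution_correlation(republicans)
print(f"whole population: {overall:.2f}, Republicans only: {within_party:.2f}")
```

How far the within-party correlation falls depends entirely on the assumed membership rule. A stricter filter (say, nobody who likes both ever joins) would push it below zero rather than merely toward it.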

This also helps explain how institutions evolve and/or tear themselves apart. For many Republicans, continuing with the pot vs. redistribution example, the attachment to the Republican Party is only partial: many are in it either because they oppose pot or because they oppose redistribution, not both. Making the Republican Party the party of overt hostility to both pot and redistribution risks alienating a significant chunk of its coalition. Of course, the “real” Republicans would want to push this, but they do so at the risk of driving their own party to ruin. The task of the leaders is to ensure that these true partisans are kept at bay so that their party is never hostile to both simultaneously. Now, in the past few decades, proverbially speaking, both parties have turned against both pot and redistribution (i.e., bowed to the “true” partisans). What does this tell us about the state of politics today?
