Does Nicolas Cage really cause drowning? A new approach to thinking about causality in observational studies

Daniel Christensen
Feb 12, 2019


Science has a problem with causality.

Maybe not the experimental scientists, but the ones who deal with observational data. And that’s where a lot of the juicy problems are. Does smoking cause cancer? Does lead exposure harm children’s IQs? Does this government or educational program work or not?

The causal implications of research are also where so many of us get our hackles up. Another study telling me that red wine causes cancer / doesn’t cause cancer, or that coffee extends lifespan by x years (1)???

The problem is pretty simple. Science is about causation (I’ll argue this below), but inferring causation in observational studies is really difficult, and scientists get stuck having a bet each way.

Nicolas Cage helps us understand why — observation alone can throw up associations just by chance. When we look at the association between the number of people who drowned by falling into a pool and the number of films Nicolas Cage appeared in (2), we recognise that the association, even though association is the basis of much current social and health research, is ridiculous, because somewhere in our minds we know there is no logical connection between the two. In other words, we are saying it’s implausible that Nicolas Cage movies caused drownings. Despite the impressive correlation (r = 0.67)*, we recognise this is just due to chance.
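To see how easily chance alone produces “impressive” correlations, here is a minimal simulation (my own illustration, not Vigen’s actual data): generate a pile of completely unrelated random-walk series, the way many real-world annual indicators drift, and scan every pair for the strongest correlation.

```python
# A minimal sketch: unrelated random walks, scanned for spurious correlations.
import numpy as np

rng = np.random.default_rng(0)
n_series, n_years = 100, 11                 # short annual series, like Vigen's charts
series = rng.normal(size=(n_series, n_years)).cumsum(axis=1)   # unrelated random walks

best = 0.0
for i in range(n_series):
    for j in range(i + 1, n_series):
        r = np.corrcoef(series[i], series[j])[0, 1]
        best = max(best, abs(r))

print(f"strongest correlation among unrelated series: r = {best:.2f}")
# With 100 series there are ~5,000 pairs to scan; correlations rivalling the
# Cage one (and usually far stronger) turn up with no causal connection at all.
```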

Because of this we are cautious about attributing causality. As a psychology graduate, I was told repeatedly “correlation doesn’t equal causation”, and “you can only infer causality from an experiment”.

But at the same time, science is inherently about causation. Scientific theory is about describing how the world works, and applied science is about using that information to make the world work a bit better. Both the theory and the application are implicitly causal — if I do A, B is a bit more (or less) likely to happen.

So, what happens is that when we write a paper, we carefully note in the Limitations section “this is an observational study, we cannot infer causal relations”. But then when we come to the Conclusion, we sneak some recommendations into our papers: “support mothers with mental health issues”, “don’t smoke”, “eat less fatty food” or “book reading is good for kids”. And what are these recommendations, if not causal assumptions in another guise?

That is, we are simultaneously affirming and denying causality. This ambivalence about causation is deep-seated, and it is there for good reasons. Unfortunately, the anxiety over causation has stunted discussion of what causation means, or what sorts of tools or points of view we can apply to better understand it.

We need an upfront discussion of what we mean by causation and when it is and isn’t reasonable to draw causal conclusions.

Fortunately, Judea Pearl and Dana Mackenzie address many of these issues in the new book, The Book of Why: The New Science of Cause and Effect (3). Pearl is a titan of mathematics and artificial intelligence, and Mackenzie is a journalist who has helped Pearl articulate his ideas. The book shines a light on the problems of inferring causality, and introduces a way of thinking about these problems that makes them a bit easier to deal with than before. It’s still not an easy read, but it’s much more accessible than Pearl’s earlier work, and well worth the effort.

The crux of this book is that without some acknowledgement of causation, we have no principled grounds for untangling spurious associations from genuine associations. For example, without a causal model of some sort, we don’t have a principled way of dealing with the Nicolas Cage and drowning example.

The birthweight paradox is another example of how the absence of a causal framework can lead us to some seriously misguided thinking. Did you know that smoking reduces infant mortality? It does if you include birthweight in the statistical model. Imagine trying to write evidence-based policy advice based on that paper.

The solution to this paradox comes from representing it causally: smoking causes low birthweight, which in turn causes infant mortality, so by ‘controlling’ for low birthweight we are taking part of the causal effect of smoking out of the model. Less intuitively (this is where the interpretation of ‘controlling for’ in regression coefficients gets tricky), by controlling for low birthweight we are comparing children with low birthweight due to smoking with children with low birthweight due to other causes, such as serious birth defects (4). It turns out that, among children with very low birthweight, the children of smokers fare better than those whose very low birthweight is due to other reasons.

So, ‘controlling for birthweight’, it appears smoking is protective. Of course, those children wouldn’t have had very low birthweight if it weren’t for their mother’s smoking, and so the regression coefficient is causal (and public health) nonsense. What we should do instead is recognise that birthweight sits on the causal pathway from smoking to infant mortality and omit it from the model. That way, we can actually see the total causal effect of smoking on infant mortality.
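A toy simulation makes the reversal concrete. The numbers below are illustrative assumptions of mine, not real epidemiological estimates: birth defects and maternal smoking both cause low birthweight, defects are far more lethal than smoking, and smoking has only a modest direct effect on mortality.

```python
# Toy simulation of the birthweight paradox (illustrative numbers only).
import numpy as np

rng = np.random.default_rng(42)
n = 500_000

smoke = rng.random(n) < 0.30                       # maternal smoking
defect = rng.random(n) < 0.02                      # serious birth defect
p_lbw = 0.05 + 0.25 * smoke + 0.80 * defect        # low birthweight risk
lbw = rng.random(n) < np.clip(p_lbw, 0, 1)
p_die = 0.01 + 0.02 * smoke + 0.05 * lbw + 0.30 * defect
died = rng.random(n) < np.clip(p_die, 0, 1)

# Crude (total) effect: smoking raises infant mortality, as expected.
print("mortality, smokers:    ", died[smoke].mean())
print("mortality, non-smokers:", died[~smoke].mean())

# 'Controlling for' birthweight by looking only within low-birthweight babies:
# smokers' babies now look better off, because the non-smokers' low-birthweight
# babies are disproportionately the ones with serious defects.
print("LBW mortality, smokers:    ", died[smoke & lbw].mean())
print("LBW mortality, non-smokers:", died[~smoke & lbw].mean())
```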

Pearl has contributed to the discussion of issues like the birthweight paradox through championing the use of diagrams to represent causal relationships. These causal diagrams are also known as Directed Acyclic Graphs (DAGs). As in Directed (they have arrows), Acyclic (you can’t have feedback loops at a given point in time), Graphs.

The birthweight paradox above can be drawn as a simple DAG.

The concept is so simple it almost sounds silly. An arrow means something causes something else; in Pearl’s system, that means it raises the probability of it happening. (For example, not all smokers get lung cancer, but smoking makes getting it far more likely.) A missing arrow means one variable has no direct causal effect on the other.

It might sound silly, but it’s actually brilliant. It’s a little tool that will lead to better science.

The application of these diagrams can help a working analyst like me decide what variables to adjust for when conducting an analysis. The following diagram illustrates the problem of ‘common cause confounding’.

In children, age is a common cause of both shoe size and vocabulary. That is, inasmuch as shoe size and vocabulary are both associated with age, we expect to see an association between shoe size and vocabulary. However, there is no arrow between them. In Pearl’s jargon, they are conditionally independent if we control for age.

So, if we asked about the causal relationship between shoe size and vocabulary but didn’t account for age in our model, we could end up with a misleading set of results. Whilst potentially predictive, this relationship is spurious (or noncausal). It seems unlikely that a theory of vocabulary development based on shoe size would advance scientific knowledge.
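A quick simulation shows the pattern (the data and effect sizes are made up for illustration): age drives both shoe size and vocabulary, the raw correlation between the two is strong, and it all but disappears once the age-driven part of each variable is removed.

```python
# Simulated data (illustrative only): age drives both shoe size and vocabulary;
# there is no arrow between shoe size and vocabulary themselves.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
age = rng.uniform(4, 12, n)                           # years
shoe = 10 + 1.2 * age + rng.normal(0, 1.0, n)         # hypothetical shoe sizes
vocab = 500 + 300 * age + rng.normal(0, 400, n)       # hypothetical word counts

# Raw correlation: it looks like shoe size 'predicts' vocabulary.
print("raw r(shoe, vocab):  ", np.corrcoef(shoe, vocab)[0, 1])

# 'Controlling for' age: correlate the parts of shoe size and vocabulary
# that age does not explain (residuals from a regression on age).
res_shoe = shoe - np.polyval(np.polyfit(age, shoe, 1), age)
res_vocab = vocab - np.polyval(np.polyfit(age, vocab, 1), age)
print("r(shoe, vocab | age):", np.corrcoef(res_shoe, res_vocab)[0, 1])
```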

This might sound odd, but an omitted common cause may explain some of the confusion over Hormone Replacement Therapy (HRT). Basically, a large observational study (the Harvard Nurses’ Health Study) found that HRT was protective against heart disease and osteoporosis (5). This was backed up by meta-analyses of other observational studies, and HRT was widely used to enhance health. By 2001, 15 million women in the US were filling HRT prescriptions annually (6). But there was nagging concern amongst clinicians that HRT was increasing mortality. Eventually, randomised controlled trials were run, and the largest of them, the Women’s Health Initiative, found that HRT increased the risks of heart disease, stroke, blood clots, and breast cancer (7). That trial was stopped early because of the increased risks to participants.

How did the observational studies get it so wrong? One possible explanation is a ‘healthy user bias’. That is, the women who, in the observational studies, chose to take HRT were also a bit more likely to floss their teeth, eat their veggies, not smoke, and so on. These healthy behaviours are also causes of decreased mortality (6, 8). In the causal lingo, we’d say ‘healthiness’ is a common cause (or confounder) of HRT use and health outcomes like heart disease and cancer. Just like vocabulary and shoe size, any association between HRT and those outcomes that doesn’t take the common cause into account is biased. In some cases, these biases can be so large they reverse apparent associations, as seems to have been the case with the nurses’ HRT study.

So, does Pearl offer us a silver bullet that will fix all known problems with causality? Will we be able to infer causality just because we have a causal diagram? Will using causal diagrams and a few heuristics protect us from the scourge of small-sample observational studies that get reported as causal (this week’s X-causes-cancer study)? No. No. No. But causal diagrams are useful tools. They make it a bit clearer what the assumptions for causal inference are. I’ve found in presentations that they are a great communication tool for opening up discussion with a general audience.

Others have picked up and run with Pearl’s approach. The epidemiologist and biostatistician Miguel Hernán has a wonderful series of video lectures on the topic (9).

Importantly, the competition of ideas in science means there is also a close look at where Pearl’s ideas (or the quest for causal inference more generally) fall down. The old concern of Garbage-In, Garbage-Out still applies: these techniques assume you have specified the correct causal model, and you can still come up with estimates even if your model is completely wrong (10). A great picture of a bad theory is still a bad theory. And calling something ‘causal’ can wipe out some of the uncertainty we should rightly hold on to.

Despite the various criticisms, I’d still argue that causal diagrams are a wonderful representation tool. They make debate over assumptions, evidence and burden of proof a lot easier. I’ve also found that, when reviewing others’ papers or listening to presentations, they help boost my bullshit filter.

Having serious discussions about causality is exciting.

So, what can we do?

If you are an interested generalist, read Pearl and Mackenzie’s book. It’s not easy the whole way through, but it’s a big improvement on Pearl’s past work.

If you are a working analyst or researcher, read the book. Watch Miguel Hernán’s videos. Draw a picture. Use DAGitty (or similar) and take your best shot at drawing a causal model for the problem you are interested in. If there are competing theories, or you aren’t sure, draw competing diagrams. It makes your assumptions more transparent, and it will give you a clue as to which variables you need to adjust for (avoiding shaky interpretations like the one in the birthweight paradox).
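If you would rather write your assumptions down in code than in DAGitty’s interface, something like the following works too. It is a sketch using Python’s networkx library (my choice of tool, not one the article recommends): it encodes the birthweight DAG and confirms that birthweight is a descendant of smoking and an ancestor of mortality, i.e. a mediator rather than a confounder.

```python
# A sketch of writing the birthweight DAG down in code with networkx.
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("smoking", "low_birthweight"),
    ("smoking", "infant_mortality"),
    ("birth_defect", "low_birthweight"),
    ("birth_defect", "infant_mortality"),
    ("low_birthweight", "infant_mortality"),
])

assert nx.is_directed_acyclic_graph(dag)   # the 'A' in DAG

# low_birthweight lies on the causal pathway from smoking to mortality
# (a mediator), so it should NOT be adjusted for when estimating the
# total effect of smoking.
print("descendant of smoking? ", "low_birthweight" in nx.descendants(dag, "smoking"))
print("ancestor of mortality? ", "low_birthweight" in nx.ancestors(dag, "infant_mortality"))
```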

As Miguel Hernán would say, ‘draw your assumptions before you draw your conclusions’. Even if you don’t end up using the picture(s) in your paper, you’ll have a nice visual summary for conference presentations, and you’ll understand the birthweight paradox and the relationship between Nicolas Cage and drownings. And you stand a chance of profoundly lifting your science game.

Acknowledgements

Thanks to Josephine Chandler, Joel Stafford, Kathleen Kennedy-Turner, and Sabian Wilde for reviewing versions of this paper.

* A correlation of 1 or -1 indicates a perfect one-to-one relationship between two variables; zero indicates no linear association at all. A correlation of 0.67 would be considered strong under most circumstances.

References

1. Ioannidis JPA. The challenge of reforming nutritional epidemiologic research. JAMA 2018. doi: 10.1001/jama.2018.11025

2. Vigen T. Spurious Correlations 2018 [Available from: http://www.tylervigen.com/spurious-correlations accessed 28/08/2018].

3. Pearl J, Mackenzie D. The Book of Why: The New Science of Cause and Effect. UK: Penguin 2018.

4. Hernández-Díaz S, Schisterman EF, Hernán MA. The Birth Weight Paradox Uncovered? Am J Epidemiol 2006;164(11):1115–20. doi: 10.1093/aje/kwj275

5. Stampfer MJ, Willett WC, Colditz GA, et al. A prospective study of postmenopausal estrogen therapy and coronary heart disease. The New England journal of medicine 1985;313(17):1044. doi: 10.1056/NEJM198510243131703

6. Taubes G. Do We Really Know What Makes Us Healthy? N Y Times Mag 2007

7. Writing Group for the Women’s Health Initiative Investigators. Risks and Benefits of Estrogen Plus Progestin in Healthy Postmenopausal Women: Principal Results From the Women’s Health Initiative Randomized Controlled Trial. JAMA 2002;288(3):321–33. doi: 10.1001/jama.288.3.321

8. Colquhoun D. Diet and health. What can you believe: or does bacon kill you? 2009 [Available from: http://www.dcscience.net/2009/05/03/diet-and-health-what-can-you-believe-or-does-bacon-kill-you/ accessed 28/08/2018].

9. Hernán MA. Causal Diagrams: Draw Your Assumptions Before Your Conclusions 2018 [Available from: https://www.edx.org/course/causal-diagrams-draw-assumptions-harvardx-ph559x].

10. Krieger N, Davey Smith G. Response: FACEing reality: productive tensions between our epidemiological questions, methods and mission. Int J Epidemiol 2016;45(6):1852–65. doi: 10.1093/ije/dyw330
