Using causal network theory as a scaffold for scientific intuition

Ferhan Dack
Met Office Informatics Lab
Sep 30, 2021

Climate scientists often use causal reasoning informally, but formalising it has two advantages. First, it makes it easier for others to grasp their assumptions and follow their argument. Second, it makes it easier to apply the same reasoning to complex scenarios, which simplifies the analysis.

These advantages can be gained by moving from reasoning informally with causal narratives to reasoning formally with causal diagrams.

Diagrammatic reasoning captures complex chains of reasoning through a few simple visual rules. Once a climate scientist has developed a theory of the causal relationships among the indices they are considering, they can draw the network explicitly and use these rules to determine which covariates to include in, and which to exclude from, their analysis. The rules are demonstrated with climate science examples in the paper “Quantifying causal pathways of teleconnections”, recently published in BAMS (Bulletin of the American Meteorological Society).
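To give a feel for how such a rule can be applied mechanically, here is a minimal sketch in Python. It assumes a made-up four-node network (the names X, Y, Z and M are placeholders, not indices from the paper) and uses one simple sufficient rule, parent adjustment: to estimate the total effect of X on Y, control for the direct causes of X and leave out anything downstream of X. This is only one of several valid rules and assumes every direct cause of X is observed.

```python
# Hypothetical causal network: each key is a node, each value lists its
# direct causes (parents). The node names are placeholders for illustration.
parents = {
    "Z": [],               # common driver of X and Y
    "X": ["Z"],            # cause of interest
    "M": ["X"],            # mediator on the path X -> M -> Y
    "Y": ["X", "M", "Z"],  # effect of interest
}


def descendants(node, parents):
    """Return every node reachable from `node` by following arrows forward."""
    children = {n: [c for c, ps in parents.items() if n in ps] for n in parents}
    found, stack = set(), [node]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def covariates_for_total_effect(cause, effect, parents):
    """Parent adjustment: include the direct causes of `cause`,
    exclude everything downstream of `cause` other than the effect itself."""
    include = set(parents[cause])
    exclude = descendants(cause, parents) - {effect}
    return include, exclude


include, exclude = covariates_for_total_effect("X", "Y", parents)
print("include:", sorted(include))
print("exclude:", sorted(exclude))
```

For this toy network the rule says to control for the common driver Z and not for the mediator M, since conditioning on M would block part of the effect being measured.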


The following researchers from the Met Office’s Informatics Lab and the University of Reading collaborated on this paper:

Marlene Kretschmer (Department of Meteorology, University of Reading),

Samantha V. Adams (Met Office Informatics Lab),

Alberto Arribas (Microsoft Europe),

Rachel Prudden (Met Office Informatics Lab),

Niall Robinson (Met Office),

Elena Saggioro (Department of Mathematics and Statistics, University of Reading),

Theodore G. Shepherd (Department of Meteorology, University of Reading)

To highlight the major points of this paper, let’s take a look at the abstract:

“Teleconnections are sources of predictability for regional weather and climate but the relative contributions of different teleconnections to regional anomalies are usually not understood. While physical knowledge about the involved mechanisms is often available, how to quantify a particular causal pathway from data is usually unclear. Here we argue for adopting a causal inference-based framework in the statistical analysis of teleconnections to overcome this challenge. A causal approach requires explicitly including expert knowledge in the statistical analysis, which allows one to draw quantitative conclusions. We illustrate some of the key concepts of this theory with concrete examples of well-known atmospheric teleconnections. We further discuss the particular challenges and advantages these imply for climate science and argue that a systematic causal approach to statistical inference should become standard practice in the study of teleconnections.”

Purpose

In a nutshell, the purpose of this paper is not to detect causal relationships, but to make climate scientists aware of the need to bring explicit causal reasoning into their statistical analysis of teleconnections. Existing knowledge about physical mechanisms can be used to quantify teleconnection pathways and distinguish between direct and indirect pathways of influence. The paper includes some concrete examples of prominent teleconnections such as the influence of ENSO on California precipitation.
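The distinction between direct and indirect pathways can be made concrete with a small linear example. The sketch below uses synthetic data, not climate observations, and assumes a network in which X influences Y both directly and through a mediator M. The variable names and path strengths are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic data from an assumed linear network: X -> M -> Y plus X -> Y.
a, b, c = 0.8, 0.5, 0.3  # assumed strengths of X->M, M->Y and X->Y
x = rng.normal(size=n)
m = a * x + rng.normal(size=n)
y = c * x + b * m + rng.normal(size=n)


def ols(target, *predictors):
    """Least-squares slopes of `target` on the predictors (intercept dropped)."""
    design = np.column_stack([np.ones_like(target), *predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1:]


total = ols(y, x)[0]             # Y ~ X: direct and indirect pathways combined
direct, via_m = ols(y, x, m)     # Y ~ X + M: the X slope is the direct pathway
indirect = ols(m, x)[0] * via_m  # product of slopes along X -> M -> Y

print(f"total={total:.2f} direct={direct:.2f} indirect={indirect:.2f}")
```

With these assumed strengths the estimates should land close to 0.7, 0.3 and 0.4 respectively, and the total is the sum of the direct and indirect parts. Which regression is the right one depends on the question being asked, which is why the causal network has to be stated first.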

Summary

Throughout this paper, causal networks are used to represent teleconnections to facilitate quantification of the strength of the pathways. The authors advocate for a formal causal framework in the statistical analysis of teleconnections, which can be obtained by grounding it in causal inference theory (as seen in the paper’s info box). Statistical analysis in weather and climate science is usually done in the context of physical, hence causal, reasoning, but this reasoning is often only informal.

This is illustrated with a number of well-known teleconnection examples. Most of the examples are shown using Multiple Linear Regression (MLR), but all the concepts extend naturally into the non-linear context, as is demonstrated with a final example.

The authors also discuss some practical challenges of using a causal framework in climate science. For instance, the various temporal scales of dependencies in the climate system can be difficult to address, as loops and cycles are generally not permitted in a causal network.

Another challenge is that all relevant processes should be included in the network, so that no unaccounted-for confounders remain. This can never be fully achieved for a system as complex as the climate. On the other hand, the same challenge (and limitation) applies to any statistical analysis.
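A short simulation shows why a missing confounder matters. This is a minimal sketch on synthetic data: a common driver Z is assumed to influence both X and Y, and the true effect of X on Y is set to 0.2. All names and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Synthetic common driver Z of both X and Y; the assumed true effect of X on Y is 0.2.
true_effect = 0.2
z = rng.normal(size=n)
x = 0.9 * z + rng.normal(size=n)
y = true_effect * x + 0.7 * z + rng.normal(size=n)


def slope_on_first(target, *predictors):
    """Least-squares slope of `target` on the first predictor, given the others."""
    design = np.column_stack([np.ones_like(target), *predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1]


naive = slope_on_first(y, x)        # Z left out of the regression
adjusted = slope_on_first(y, x, z)  # Z included as a covariate

print(f"naive={naive:.2f} adjusted={adjusted:.2f} true={true_effect}")
```

Leaving Z out inflates the estimated effect of X well beyond the assumed 0.2, while including it recovers a value close to the truth. In a real analysis the confounder may be unknown or unobserved, which is exactly the limitation described above.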

Bearing these concerns in mind, for teleconnections there is usually enough expertise available to draw a plausible causal model that articulates the physical hypotheses. The authors argue that every climate science paper invokes physical hypotheses in some way, as any statistical study design depends on causal assumptions. Causal networks turn these claims into testable objects, making research conclusions more transparent and traceable.

Wrapping Up


The transparent and rational nature of causal network analysis can help in overcoming many of the limitations faced in current studies, and in reconciling differences between the conclusions of different studies. However, a causal approach is not meant to compete with traditional climate model experiments or physical theory. Instead, it serves as a “scaffold” to build scientific intuition into the statistical analysis of the data. Both basic physics and data science are needed to make progress in climate science, and causal theory is a framework for better reconciling the two.

Many thanks to Rachel Prudden, Sam Adams, Marlene Kretschmer and Ted Shepherd for providing background information.
