What should we consider about causality? -Conceptual Framework-

8 min read · Nov 13, 2023

Introduction

Questions such as “Did the medication have an impact on the patient’s recovery?”, “Did the campaign increase product purchases?” or “What improvements could increase store sales?” come up all the time, and reasoning about causal relationships is a tool for answering them.

Even though we usually try to answer such questions by interpreting causal relationships, identifying those relationships in a scientifically correct way is quite difficult.

The human brain is thought to reason about causality innately and almost unconsciously. However, when people who are not experts in data science carry out causal inference, the result sometimes disagrees with the interpretation of the causal relationship that human intuition suggests.

A well-known and easy-to-understand example is the problem of “confounding.” The following plot [1], which visualizes the relationship between chocolate consumption and the number of Nobel Prize winners by country, suggests that countries with high chocolate consumption also tend to have many Nobel Prize winners. The correlation coefficient between the two is also high, at 0.79. But can we use these results to decide to “promote chocolate consumption in order to increase the number of Nobel Prize winners”? Intuitively, it seems strange to conclude that increasing chocolate consumption would lead to more Nobel Prize winners.

There are two possible reasons. The first concerns “chocolate consumption” and “Nobel Prize winners” themselves. The plot and the correlation coefficient do not determine the direction of cause and effect. It is therefore impossible to conclude from this analysis whether “as chocolate consumption increases, the number of Nobel Prize winners increases” or “as the number of Nobel Prize winners increases, chocolate consumption increases.”

The second is spurious correlation due to confounding. Consider an indicator that does not appear in the plot, nominal GDP per capita (GDP): countries with high GDP may tend to have both high chocolate consumption and many Nobel Prize winners. In other words, because GDP influences both chocolate consumption and the number of Nobel Prize winners, the two appear to be related to each other. In this way, confounding caused by other factors (which may or may not be observed) can lead to incorrect interpretations of the relationship of interest.
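The confounding story above can be checked on synthetic data. The following is a minimal sketch, not an analysis of the real figures in [1]: the coefficients 2.0 and 1.5, the noise levels, and the sample size are all assumptions chosen for illustration. Chocolate consumption and Nobel Prize counts are both generated from GDP alone, with no arrow between them, and the code compares their raw correlation with the correlation that remains after GDP has been regressed out of both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after a least-squares line fit on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data-generating process: GDP drives both variables,
# and neither variable influences the other.
gdp = [rng.gauss(0, 1) for _ in range(n)]
chocolate = [2.0 * g + rng.gauss(0, 1) for g in gdp]
nobel = [1.5 * g + rng.gauss(0, 1) for g in gdp]

raw_corr = pearson(chocolate, nobel)
adjusted_corr = pearson(residuals(chocolate, gdp), residuals(nobel, gdp))

print(f"correlation ignoring GDP:       {raw_corr:.2f}")
print(f"correlation after removing GDP: {adjusted_corr:.2f}")
```

In this toy setup the raw correlation is strong even though chocolate has no effect on Nobel Prizes, and it largely disappears once the confounder is accounted for. With real data the confounder may not be observed at all, which is exactly what makes the problem hard.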

Three rungs of the ladder for interpreting causal relationships

In “The Book of Why: The New Science of Cause and Effect” by Judea Pearl, one of the masters of causal reasoning, three levels of cognitive ability are said to be needed to understand causal relationships, as shown in the following picture. Let’s explore them one by one below.

Source: The Book of Why, Chapter 1, Figure 2

1. Association - Seeing / Observing

The first rung of the ladder of cause and effect is called Association. It refers to observing the surrounding environment and finding regularities in what is observed. Associations can be described scientifically by summarizing data with statistical approaches such as correlation or regression. The typical causal question at this rung is “What if A is observed…?” A typical way to answer it is with a conditional probability, which describes the probability of an event given an observation; in other words, it carries the meaning “given that this has been observed and taken into account…”.

For example, suppose the question is, “What is the probability that a customer who bought toothpaste also buys dental floss?” At this rung, the answer is the proportion of customers who bought toothpaste and also bought dental floss, that is, the conditional probability P(Dental floss | Toothpaste).

However, since data is nothing more than recorded facts, mainly in numerical form, the Association approach is basically unable to answer the question “Why did this happen?” Furthermore, typical machine learning, which learns the trends shown by such data, can only learn what the data can teach. To respond to new phenomena, it therefore needs new observations to be collected and explicitly added to the learning process.
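As a small illustration of this rung, the conditional probability can be computed directly from purchase records. The baskets below are made up for the example and are not real sales data; the helper name conditional_probability is likewise only illustrative.

```python
# Hypothetical purchase baskets, one set of products per customer.
baskets = [
    {"toothpaste", "dental floss"},
    {"toothpaste"},
    {"toothpaste", "dental floss", "mouthwash"},
    {"toothpaste", "dental floss"},
    {"mouthwash"},
    {"dental floss"},
]


def conditional_probability(baskets, target, given):
    """Estimate P(target | given) as a proportion among baskets containing `given`."""
    with_given = [b for b in baskets if given in b]
    if not with_given:
        raise ValueError(f"no basket contains {given!r}")
    return sum(target in b for b in with_given) / len(with_given)


p_floss_given_toothpaste = conditional_probability(
    baskets, target="dental floss", given="toothpaste"
)
print(p_floss_given_toothpaste)  # 3 of the 4 toothpaste baskets contain floss: 0.75
```

The number only summarizes what was observed. It says nothing about what would happen to dental floss purchases if toothpaste purchases were changed on purpose.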

2. Intervention - Doing

To move up the ladder of cause and effect, we introduce Intervention: predicting the effects of intentionally changing the environment, which yields information that cannot be acquired through observation alone.

Consider the toothpaste and dental floss example again. The question at this rung is, “If the price of toothpaste is doubled, how would sales of dental floss change?” Note that there is a big difference between “observing that the price of toothpaste has doubled” and “doubling the price of toothpaste.” It is difficult to observe a change in the price of toothpaste and conclude that it caused a change in the sales of dental floss. This is because pure observation gives us no information about other, unobserved factors, which makes it difficult to obtain a comparison target for the price of toothpaste while holding the rest of the environment fixed.

With intervention, on the other hand, an experiment-like situation can be reproduced by combining data close to what observation provides, which is key to explaining the environment surrounding an event, with a causal model that describes that environment structurally. Questions about intervention can also be written in a symbolic language using the intervention operator do. If the target variable changes as a result of an intervention, the intervention can be identified as the cause of that change, which indicates a causal relationship between the intervened variable and the target variable.

To close this section, here is a summary of the differences between the first and second rungs of the causal ladder. On the first rung, we observe that sales of dental floss changed by a factor of X while the price of toothpaste doubled. There is no clear direction of cause and effect here: we cannot tell whether “the price of toothpaste doubled, so sales of dental floss changed by a factor of X” or “sales of dental floss changed by a factor of X, so the price of toothpaste doubled” is valid. On the second rung, we intervene on one target, here by doubling the price of toothpaste. If we then observe a change in sales of dental floss (assuming other factors that could cause the change are controlled for), that change is a causal effect of the change in the price of toothpaste, and we may conclude that there is a causal relationship between the two variables.
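The gap between “observing” and “doing” can be made concrete with a toy simulation. Everything in the sketch below is an assumption made for illustration: a hidden demand factor that pushes up both the toothpaste price and dental floss sales, a true effect of -2.0 sales units per unit of price, and the noise levels. It compares the slope obtained from passively observed data with the effect measured by actually setting the price from 1.0 to 2.0, that is, doubling it.

```python
import random


def simulate(n, rng, do_price=None):
    """Draw n (price, floss_sales) pairs from a hypothetical structural model.

    do_price=None  -> observation: price reacts to hidden demand.
    do_price=value -> intervention: price is set to `value` regardless of demand.
    """
    rows = []
    for _ in range(n):
        demand = rng.gauss(0, 1)  # unobserved factor affecting both variables
        if do_price is None:
            price = 1.0 + 0.5 * demand + rng.gauss(0, 0.1)
        else:
            price = do_price  # the arrow demand -> price is cut
        floss_sales = 10.0 - 2.0 * price + 3.0 * demand + rng.gauss(0, 0.5)
        rows.append((price, floss_sales))
    return rows


def mean(values):
    return sum(values) / len(values)


def slope(rows):
    """Least-squares slope of floss_sales on price."""
    mp = mean([p for p, _ in rows])
    ms = mean([s for _, s in rows])
    cov = sum((p - mp) * (s - ms) for p, s in rows)
    var = sum((p - mp) ** 2 for p, _ in rows)
    return cov / var


rng = random.Random(1)
n = 20000

# Rung 1: associate price and sales in passively observed data.
observed_slope = slope(simulate(n, rng))

# Rung 2: set the price ourselves and compare average sales.
sales_at_1 = mean([s for _, s in simulate(n, rng, do_price=1.0)])
sales_at_2 = mean([s for _, s in simulate(n, rng, do_price=2.0)])
interventional_effect = sales_at_2 - sales_at_1

print(f"slope seen in observational data: {observed_slope:+.2f}")
print(f"effect of doubling the price:     {interventional_effect:+.2f}")
```

In this made-up world the observational slope is positive, because hidden demand raises both price and sales, while the intervention recovers the negative effect that was built into the model. The simulation can intervene only because the structural model is known; with real data, that model is exactly what has to be justified.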

3. Counterfactuals - Imagining

Up to the second rung, causal effects have been described within the observable domain, as effects on the entire population or on typical individuals drawn from it. The third and final rung of the ladder introduces the idea of Counterfactuals, which puts the spotlight on events that are unobservable. It also allows us to estimate causal effects for particular individuals by comparing the consequences of what actually happened with those of the counterfactual.

The question asked here is, “What would have happened if I had not done this?” Answering it lets us understand the theory behind a certain result and think about a different course of events from the one that actually occurred.

It is important to note that data represents “facts” and is therefore poorly suited to reasoning about counterfactuals. For example, when aspirin is administered to a certain patient A, the data cannot show what would have happened to the same patient, at the same time, without aspirin. It is therefore difficult to describe a counterfactual with a simple approach based on data alone. From data alone, it is impossible to answer the question, “If Patient A, who was given aspirin, had not been given aspirin at that time, would his headache have been cured?”

Causal models are expected to be an effective approach in such situations. A causal model describes the relationships among the target events structurally. Because it lets us draw up a “theory” of the entire environment, the effect of a change somewhere within the environment on the overall structure can be worked out from the model, and this provides answers to counterfactual questions.
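One way to see how a causal model answers a counterfactual question is the three-step procedure described by Pearl: abduction (infer the individual’s unobserved background from what actually happened), action (change the variable of interest), and prediction (recompute the outcome). The sketch below applies it to the aspirin example under strong assumptions that are not part of the original discussion: a linear structural equation, an assumed aspirin effect of -3.0 headache hours, and made-up observations for Patient A.

```python
# Assumed effect of aspirin on headache duration, in hours (hypothetical).
ASPIRIN_EFFECT = -3.0


def headache_hours(aspirin, background):
    """Assumed structural equation: outcome = patient background + treatment effect."""
    return background + ASPIRIN_EFFECT * aspirin


def counterfactual_hours(observed_aspirin, observed_hours, alternative_aspirin):
    # Step 1, abduction: recover this patient's unobserved background term
    # from the outcome that actually occurred.
    background = observed_hours - ASPIRIN_EFFECT * observed_aspirin
    # Step 2, action: replace the treatment with the alternative value.
    # Step 3, prediction: recompute the outcome for the same patient.
    return headache_hours(alternative_aspirin, background)


# Fact: Patient A took aspirin (1) and the headache lasted 2 hours.
# Counterfactual: the same patient, at the same time, without aspirin (0).
hours_without_aspirin = counterfactual_hours(
    observed_aspirin=1, observed_hours=2.0, alternative_aspirin=0
)
print(hours_without_aspirin)  # 5.0 under the assumed model
```

The answer depends entirely on the assumed structural equation. Data alone could never supply it, because Patient A was never observed without aspirin.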

Tools for describing causal relationships

Up to this point, we have asked a different question at each rung of the ladder of cause and effect and organized the ways of thinking needed to answer it. Along the way, several tools besides data have been introduced for approaching each causal question.

One of them is a causal model described by a causal diagram, and the other is a symbolic language for writing causal relationships formally. A causal diagram uses points (shown as boxes in the following sample) and arrows to organize and depict what is known about a problem situation. Based on this diagram, the target phenomena are organized structurally as a causal model. With the relationships made visually explicit, causal questions can then be expressed in the symbolic language. For example, the influence of a certain treatment Z on a target variable Y can be written as P(Y|do(Z)), using the intervention operator do, the symbol that expresses intervention.

An example of a causal model described by a causal diagram

These methods allow us to express scientifically the causal reasoning that human beings usually carry out unconsciously. They are therefore significant, in that they have the potential to contribute to the elucidation and generalization of such systems of thought.
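In the symbolic language, the difference between seeing and doing can be written out explicitly. The following is a sketch of the standard backdoor adjustment formula. It assumes a set of observed covariates X (a symbol not used elsewhere in this article) that blocks every backdoor path from Z to Y in the causal diagram; whether such a set exists depends on the diagram at hand.

```latex
% Seeing: conditioning on an observed value of Z
P(Y \mid Z = z)

% Doing: intervening to set Z = z, adjusted for covariates X
% (valid only if X satisfies the backdoor criterion relative to (Z, Y))
P(Y \mid do(Z = z)) = \sum_{x} P(Y \mid Z = z, X = x)\, P(X = x)
```

The two expressions coincide when Z and Y share no common cause. Otherwise, conditioning and intervening generally give different answers.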

The difficulties of causal models and Causal Discovery

To explain causal effects with a causal diagram, we must start by identifying the direction of causality between each pair of variables. However, it can be difficult to establish every causal direction in a causal structure when multiple variables are measured at the same time. [2]

We cannot always specify, as hypotheses, every causal structure related to the phenomenon of interest. For example, among three variables A, B, and C, the direction of the causal relationship between A and B may be completely unknown or, even worse, it may be unclear whether there is a causal relationship at all. To make inferences with causal models in such situations, hypotheses about the causal structure can be constructed from the data at hand. This approach of inferring a causal graph from data is called Causal Discovery.

That’s all for this time! The next article will introduce one of the causal discovery approaches and explain how to define the causal structure.
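To get a feel for why the structure is hard to fix by hand, it helps to count the candidate hypotheses. The sketch below is only a brute-force enumeration, not a causal discovery algorithm. It assumes causal structures are represented as directed acyclic graphs with at most one arrow between each pair of variables, and lists every such graph over A, B, and C.

```python
from itertools import product

VARIABLES = ("A", "B", "C")
PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]


def has_cycle(edges):
    """Return True if the directed graph contains a cycle.

    Repeatedly strips nodes that have no incoming edge; if nodes remain
    but none can be stripped, the remainder must contain a cycle.
    """
    nodes = set(VARIABLES)
    edges = set(edges)
    while nodes:
        sources = [n for n in nodes if not any(dst == n for _, dst in edges)]
        if not sources:
            return True
        nodes -= set(sources)
        edges = {(src, dst) for src, dst in edges if src not in sources}
    return False


def candidate_dags():
    """Enumerate every acyclic arrow assignment over the three variables."""
    dags = []
    for choice in product(("none", "forward", "backward"), repeat=len(PAIRS)):
        edges = []
        for (x, y), direction in zip(PAIRS, choice):
            if direction == "forward":
                edges.append((x, y))
            elif direction == "backward":
                edges.append((y, x))
        if not has_cycle(edges):
            dags.append(tuple(edges))
    return dags


dags = candidate_dags()
print(len(dags))  # 27 arrow assignments minus the 2 directed triangles: 25
```

Even three variables give 25 candidate structures, and the count grows very quickly as variables are added, which is one reason hypotheses about the structure are constructed from data rather than listed by hand.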

References

[1] Franz H. Messerli. Chocolate Consumption, Cognitive Function, and Nobel Laureates, 2012

[2] Judea Pearl, Dana Mackenzie. The Book of Why: The New Science of Cause and Effect, 2018

[3] Natalie Staplin, William G. Herrington, Parminder K. Judge, Christina A. Reith, Richard Haynes, Martin J. Landray, Colin Baigent, Jonathan Emberson. Use of Causal Diagrams to Inform the Design and Interpretation of Observational Studies: An Example from the Study of Heart and Renal Protection (SHARP), 2017