Causal reasoning: A fairly overlooked piece in artificial intelligence

Published in

Intel Student Ambassadors

6 min read · Dec 3, 2018

Artificial Intelligence (AI) is helping solve many complex real-world problems. Email spam filtering, image captioning, speaking with Alexa (and other devices), and many other applications that were unimaginable not long ago have all become possible with AI. The increasing pace of AI development is heartening, since it promises to help solve many of our everyday problems. One should, however, also keep an eye on the (fairly) under-explored aspects of AI and ask: what is one of the next steps in AI? Let us look at what a few AI scholars suggest for future research.

Scholar Opinions

In a recent interview with Quanta Magazine, Judea Pearl, Turing Award winner and professor of computer science at UCLA, known for his significant contributions to probabilistic AI and Bayesian networks, points out that much of the current common practice in AI “amounts to just curve fitting.” By curve fitting, he by no means belittles current practice in AI but (from my understanding) refers to the task of finding associations between variables. For example, given a large database of images of human cells, it would be valuable to design an automated system that uncovers the (associational) relationship between whether or not a cell is cancerous and, say, its shape. Current non-causal AI methods are, to some extent, capable of finding such associational relationships, although challenges still exist.

However, informative as these associations are, they fall short of providing information about the causal relationships between variables; hence the famous mantra “association is not causation.” One example is the strong association, across many countries, between chocolate consumption and winning a Nobel prize, which, read naively, would imply that the more chocolate you eat, the more likely you are to win a Nobel prize. Of course, this does not make much sense, so what did we miss? The answer is that the relationship between chocolate consumption and winning a Nobel prize is merely associational and must not be interpreted as if eating chocolate causes winning a Nobel prize. Another example is the significant correlation between the stork population and baby deliveries in Germany, which, again, must not be interpreted as causal; relating the stork population to baby deliveries is what happens when a proper causal analysis is missing.
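To see how such an association can arise without any causal link, here is a minimal sketch on synthetic data. It is not an analysis of the real chocolate or stork figures. It assumes a hypothetical hidden common cause (called `wealth` here purely for illustration) that drives both quantities, while neither quantity affects the other. All names and numbers are made up.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after removing its best linear fit on zs."""
    n = len(ys)
    my, mz = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

def simulate(n=5000, seed=0):
    rng = random.Random(seed)
    # Hypothetical common cause (an assumption made for this sketch).
    wealth = [rng.gauss(0, 1) for _ in range(n)]
    # Both variables depend on wealth; neither depends on the other.
    chocolate = [w + rng.gauss(0, 1) for w in wealth]
    prizes = [w + rng.gauss(0, 1) for w in wealth]
    return wealth, chocolate, prizes

wealth, chocolate, prizes = simulate()

# Raw association: clearly positive, although chocolate never enters prizes.
raw = pearson(chocolate, prizes)

# Association after accounting for the common cause: close to zero.
adjusted = pearson(residuals(chocolate, wealth), residuals(prizes, wealth))

print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this toy setup the raw correlation is substantial, yet it all but disappears once the common cause is accounted for. Real data rarely hand us the common cause so conveniently, which is one reason causal analysis is hard.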

Other scholars have also called for more attention to causal reasoning in AI. In an article in the New York Times, Gary Marcus, professor of psychology and neuroscience at NYU, and Ernest Davis, professor of computer science at NYU, write about the problem of fake news on Facebook (FB) and reason that, for us to (actually) be able to detect fake news using AI, causal reasoning has to play a role in the AI systems we develop. They argue that without a proper causal analysis, fake news detection is not possible in general. Michael Jordan, professor of computer science and statistics at UC Berkeley, also points, in an article on Medium, to “the need to infer and represent causality” in future AI systems. Articles and published papers about the importance of causal reasoning in AI are not hard to find.

What is Causal Inference?

Now, let us say we would like to dive into causal reasoning. First things first, one might ask, “What kinds of questions are causal?” In fact, they are not far-fetched, and we face them in our everyday lives. Questions such as:

“Would the ad on my website get more clicks had its background color been red instead of blue?” or
“Would my FB news feed look much more like the way I want it to, had I answered FB’s survey questions?” or
“Would my blood pressure be lower had I consumed less salt?” or
“What if I consumed whey protein instead of creatine after a workout, would I gain more muscle?” or
“Would I not get lung cancer had I not smoked?”

…and many more are all causal questions. The first question asks about the causal effect of background color on ad clicks, the second about the causal effect of answering survey questions on the quality of one’s FB news feed, the third about the causal effect of salt on blood pressure, the fourth about the causal effect of supplements on muscle gain after a workout, and the fifth about the causal effect of smoking on lung cancer. In general, many “what if” questions, which ask what would have happened had we taken some alternative action compared with what we actually did, are causal in nature. Answering these questions requires moving beyond association and taking up the mathematics of causal reasoning, an effort that is fortunately under way in state-of-the-art research in artificial intelligence and in other disciplines such as statistics, epidemiology, and economics.

How to Solve Causal Inference Questions?

The science of causal reasoning has roots in, and continues to develop across, various disciplines, notably philosophy (going back as far as Aristotle), epidemiology, economics, statistics, and computer science. Currently, two notable, well-established frameworks that provide a solid mathematical infrastructure for causal reasoning are the following:

1) Structural Causal Models (SCMs), developed by Judea Pearl, which enable us to infer causality via Directed Acyclic Graphs (DAGs). The mathematics of intervention, a necessary component for inferring causality, is provided in SCMs through the “do-operator.” When assessing whether X causes Y, one learns through this framework that, in general, Pr(Y=y | X=x) ≠ Pr(Y=y | do(X=x)), where do(X=x) denotes an intervention that sets the random variable X to the value x and Pr(.) denotes the probability of an event. (Note that I only consider here whether X causes Y; such a relationship is asymmetric in a causal sense, and hence whether Y causes X needs a separate causal analysis.) The left-hand side of the formula above is a conditional probability (not causal in general), whereas establishing a causal relationship between variables requires incorporating intervention (the right-hand side of the formula) and using the mathematics of the “do-calculus.”

2) The “potential outcomes” framework, formulated and developed by Donald Rubin, professor of statistics at Harvard (and originally proposed by Jerzy Neyman), which is commonly referred to as the Rubin Causal Model (RCM) or the Rubin-Neyman causal model. In order to establish whether X causes Y, this framework, speaking at a high level, contrasts the potential outcomes (Y) of an interventional experiment in which a data point (say, an individual) is exposed to different levels of X. For example, if we are to examine the causal effect of Advil on headache, we would contrast the outcomes (whether one has a headache) of an interventional experiment under treatment (taking Advil) and under control (not taking Advil). The gold standard experiment for finding such a contrast of outcomes is a Randomized Controlled Trial (RCT), but RCTs are not always feasible. The details of experimental design, as well as of how to infer causality in the absence of RCTs, are beyond the scope of this article, but I would be more than happy to discuss them further with interested readers.
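To make the two frameworks a little more concrete, here is a small simulation sketch. Everything in it is an assumption made for illustration: a binary confounder Z that influences both X and Y, the probabilities in the structural equations, and the helper names `f_x`, `f_y`, and `simulate`. It compares the conditional probability Pr(Y=1 | X=1) with the interventional probability Pr(Y=1 | do(X=1)), and then estimates the effect of X on Y from potential outcomes and from a simulated randomized experiment.

```python
import random

def f_x(z, u_x):
    """Structural equation for X: units with Z=1 take the treatment more often."""
    return u_x < (0.8 if z else 0.2)

def f_y(x, z, u_y):
    """Structural equation for Y: X raises Pr(Y=1) by 0.2, Z raises it by 0.5."""
    return u_y < 0.1 + 0.2 * x + 0.5 * z

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def simulate(n=100_000, seed=0):
    rng = random.Random(seed)
    # Each unit: confounder Z plus exogenous noise terms for X and for Y.
    units = [(rng.random() < 0.5, rng.random(), rng.random()) for _ in range(n)]

    # Observational regime: X follows its own equation, so it depends on Z.
    observed = []
    for z, u_x, u_y in units:
        x = f_x(z, u_x)
        observed.append((x, f_y(x, z, u_y)))
    p_y_given_x1 = mean(y for x, y in observed if x)
    p_y_given_x0 = mean(y for x, y in observed if not x)

    # Intervention do(X=x): replace the equation for X by a constant.
    # The two lists are also each unit's potential outcomes Y(1) and Y(0).
    y1 = [f_y(True, z, u_y) for z, _, u_y in units]
    y0 = [f_y(False, z, u_y) for z, _, u_y in units]

    # Randomized experiment: a coin flip, independent of Z, assigns X.
    trial = []
    for z, _, u_y in units:
        x = rng.random() < 0.5
        trial.append((x, f_y(x, z, u_y)))
    rct_diff = (mean(y for x, y in trial if x)
                - mean(y for x, y in trial if not x))

    return {
        "p_y_given_x1": p_y_given_x1,           # analytically 0.70 here
        "p_y_do_x1": mean(y1),                  # analytically 0.55 here
        "naive_diff": p_y_given_x1 - p_y_given_x0,  # analytically 0.50
        "ate": mean(a - b for a, b in zip(y1, y0)),  # analytically 0.20
        "rct_diff": rct_diff,                   # estimates the same 0.20
    }

results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed parameters, conditioning on X=1 overstates how often Y occurs compared with forcing X=1, because units with X=1 are also more likely to have Z=1. The naive difference between the X=1 and X=0 groups is likewise much larger than the true average effect, whereas the randomized assignment recovers it, since the coin flip breaks the link between Z and X.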

The two frameworks, the SCM and the RCM, use two fairly different “representations” (if I may) and sets of assumptions for causal reasoning but, at the end of the day, are axiomatically translatable to each other and serve the same goal, which is extracting causal relationships between variables. Despite the availability of such well-established frameworks, causal reasoning in artificial intelligence has not matured yet, and many difficult challenges remain unresolved. Many current AI systems are, in general, not capable of causal reasoning. Fortunately, however, several AI researchers, including our group at Pennsylvania State University, are actively working in this area. We hope that with causal reasoning, AI systems will improve substantially and be able to help solve more complex real-world problems.

Disclaimer: The contents of this article represent only my opinions and should not be attributed to the scholars mentioned above.

Written by Aria Khademi