Image: From left to right: Lars Kunze, Jeong-Yoon Lee, Antoine Bertoncello, Judea Pearl, Franziska Bell, Candice Hogan, Harsha Tharkabhushanam, Robert Ness, Totte Harinen, Ben Vincent

bp’s Causal Inference Symposium: Discussing the next frontier in AI

Franziska Bell
bp tech blog

--

By Fran Bell, senior vice president for digital technology and
Natalia Konstantinova, staff data scientist

Artificial intelligence (AI) has taken the world by storm. Users, employees and companies have benefitted from its applications and it has become part of our everyday lives.

Like many companies, bp uses AI and data science across many of our businesses. While traditional machine learning and deep learning have unlocked clear benefits, they do so by relying on correlation and pattern recognition, methods that have little sense of robust cause and effect.

This reliance on correlation can lead to significant failures when models are applied beyond the conditions they were trained on. As a result, bp's AI strategy is centred around human-machine teams, where we leverage the best of both humans and AI. Machines can quickly analyse and find patterns across large volumes of data, while people are great at creativity, intuition and abstract reasoning. By combining the strengths of both, we can make better and faster decisions. Importantly, this human-in-the-loop approach allows subject matter experts to correct and augment any shortcomings of the AI.

In addition, we are interested in and have started to apply approaches that can systematically overcome AI’s current limitations. Incorporating domain expertise and developing models that capture cause and effect is especially vital in high-stakes decision making, such as in the energy industry, healthcare, economics and the development of autonomous driving.

We also believe that the lack of cause-and-effect reasoning in these models is the biggest blocker to AI being able to reason and attain human-level intelligence.

Causal inference from observational data is an approach that helps data scientists deduce cause-and-effect relationships from observed data, by going beyond traditional statistical correlation, to understand the impact one variable has on other variables. Causal AI refers to the integration of causal inference principles and methodologies into artificial intelligence systems.

Despite its criticality and potential, causal AI has not received as much attention as it deserves. Only a handful of experts have spoken about it at major conferences, and the father of causal inference, Professor Judea Pearl, says that today, only one in every 1,000 data scientists study the science of cause and effect.

So, in September 2023, bp invited causal AI heavyweights from leading academic institutes and industry to meet virtually for a Causal Inference Symposium, which was attended by more than 900 people from academic institutes and companies.

Our aim was to make space for the causal AI discussion and bring great minds together to solve problems that matter. Given the importance of causal AI to the energy industry and beyond, bp is keen to build a global community on causal inference and help foster the sharing of knowledge and new methods, as part of our efforts to build safe, reliable and useful AI that provides value for all.

At our symposium, we had a fantastic line-up of data scientists and researchers from the University of Oxford, Microsoft, Airbnb, Uber, Toyota Research Institute, TotalEnergies and, of course, bp, share examples of how causal inference can create value and unlock opportunities.

We can hardly talk about causal AI without understanding its origins and the impact Professor Pearl’s work has had on statistics, medicine, social sciences and computer science over the last two decades.

And who better to tell us about it than the pioneer himself? Below is a summary of his keynote.

Causal hierarchy and the two laws of causal AI

Turing Award winner Professor Judea Pearl, who gave the keynote speech at bp's 2023 Causal Inference Symposium

Today, big data is governed by the paradigm that all wisdom is derived from data. Under this paradigm, our job is to fit the data as well as we can, and answers emerge only when the data happen to fit the questions we ask.

The scientific approach is entirely different — the wisdom comes from the model of the world, rather than from data. In this paradigm, we ask: “What should the world be like before I can answer a research question about that world?”

Professor Pearl advocates that modern data science must satisfy both model assumptions and data, and that data scientists should train themselves in building structural causal models (SCMs).

In his talk, Professor Pearl introduced the audience to several fundamental concepts that form the foundations of modern causal inference.

One of the fundamental concepts he spoke about is the ladder of causation, which has three levels:

  1. Association
  2. Intervention
  3. Counterfactuals

The first rung (“association” or “seeing”) is about observing patterns and only requires data.

The second rung (“intervention” or “doing”) answers what-if questions: for example, what would happen to sales volumes if we changed the price? This requires both data and a model of the underlying causal structure.

The third rung (“counterfactuals” or “imagining”) is imagining an alternate outcome to something that happened. Here’s a real-world example: “I have lung cancer. Would I still have gotten cancer if I had stopped smoking two years ago?”

Ladder of causation

Importantly, mathematical proofs show that answering a question requires information from the same or a higher rung of the ladder of causation. This means that purely associative models, such as classical statistical models and current machine learning and deep learning models, cannot reliably answer what-if questions, no matter how much data we feed them.
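The gap between the first two rungs can be seen in a few lines of simulation. The sketch below is our own illustrative example (invented numbers, plain Python): a hidden confounder Z drives both a treatment X and an outcome Y, so conditioning on X = 1 (seeing) gives a different answer than setting X = 1 (doing).

```python
import random

random.seed(0)

def draw(do_x=None):
    """One sample from a toy structural causal model: Z -> X, (Z, X) -> Y."""
    z = 1 if random.random() < 0.5 else 0                        # hidden confounder
    if do_x is None:
        x = 1 if random.random() < (0.8 if z else 0.2) else 0    # Z influences X
    else:
        x = do_x                                                 # intervention severs Z -> X
    y = 1 if random.random() < 0.2 + 0.5 * z + 0.1 * x else 0
    return x, y

n = 100_000

# Rung one: observe, then condition on X = 1.
obs = [draw() for _ in range(n)]
treated = [y for x, y in obs if x == 1]
p_obs = sum(treated) / len(treated)

# Rung two: intervene, setting X = 1 for everyone.
p_do = sum(y for _, y in (draw(do_x=1) for _ in range(n))) / n

print(f"P(Y=1 | X=1)     ≈ {p_obs:.2f}")   # association, inflated by Z (~0.70)
print(f"P(Y=1 | do(X=1)) ≈ {p_do:.2f}")    # true interventional effect (~0.55)
```

No amount of extra observational data closes the gap between the two estimates; only a model of how X and Y are generated (here, the `draw` function itself) lets us compute the interventional quantity.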

The ladder of causation, as mentioned above, rests on two fundamental laws of causality:

· The law of counterfactuals (and interventions), and

· The law of conditional independence

The former establishes how to define and compute counterfactuals, along with their probabilities, while the latter lets us read off, from a visual representation of the model's structure, which conditional independencies the model's assumptions imply about the data.

The next section will cover what counterfactuals and conditional independence are.

The law of counterfactuals

The law of counterfactuals can be summed up by the formula:

Yx(u) = YMx(u)

Y: This represents the outcome variable of interest. It’s the variable you’re trying to understand or predict.

x: This is a specific value or level of the variable X, which is typically a treatment or exposure variable. In the context of causality, X is often considered the cause or treatment, and Y is the effect.

U: This represents a set of background variables, the conditions under which you're evaluating the potential outcome. U stands for the unobserved or background factors that can affect the outcome.

Mx: This represents the modified model in which the structural equation for X is replaced by the constant x; that is, the model after the intervention do(X = x). It captures how Y responds when X is set rather than observed.

Yx(u): This is the potential outcome of Y under the hypothetical condition that X takes on the specific value x, given the background conditions u.

YMx(u): This is the outcome of Y computed in the modified model Mx, given the background conditions u. The law states that these two quantities are equal: a counterfactual is, by definition, the solution for Y in the surgically modified model.

Once you have elucidated this model, you can compute interventions or counterfactuals, no matter how complex.
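As a concrete illustration, a counterfactual can be computed with the standard three-step recipe of abduction, action and prediction. The deterministic XOR mechanism below is our own invented toy example, not one from the talk:

```python
# Three-step counterfactual computation in a tiny deterministic SCM,
# assuming the (invented) structural equation y = x ^ u.

def f_y(x, u):
    return x ^ u  # structural equation for Y

# Observed evidence: X = 1 and Y = 1.
x_obs, y_obs = 1, 1

# 1. Abduction: infer the background variable U consistent with the evidence.
u = next(u for u in (0, 1) if f_y(x_obs, u) == y_obs)  # here u = 0

# 2. Action: modify the model, setting X to the counterfactual value x' = 0.
x_cf = 0

# 3. Prediction: compute Y in the modified model using the inferred U.
y_cf = f_y(x_cf, u)
print(y_cf)  # 0: "had X been 0, Y would have been 0"
```

The same recipe scales to probabilistic models, where abduction updates a distribution over U rather than pinning down a single value.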

The law of conditional independence (also known as d-separation)

The law of conditional independence is the second fundamental principle of causal AI and the foundation for several models.

Take the well-known example of rolling two fair dice, which are typically assumed to operate independently.

Observing the outcome of one die doesn't provide information about the outcome of the other, indicating independence. However, if you learn an additional detail, such as the sum of the two results being even, and the first die shows a 5, this extra piece of information tells you that the second die must also show an odd number.

Essentially, two events can be independent yet fail to be conditionally independent: once we condition on a third event, such as the parity of the sum, a dependence between them can appear.

Individual causal effects — observational studies augmenting randomized experiments

After providing an overview of the fundamentals of causal inference, Professor Pearl shared his recent work on individual causal effects and how causal inference from observational data can provide valuable insights even when outputs from randomized experiments are available.

In Professor Pearl’s conceptual example, he described how a combination of observational and experimental studies can improve personalized decision making in the healthcare domain, compared to just experimental data alone.

When introducing new treatments, healthcare policymakers have to make crucial decisions about which population groups are likely to experience the greatest advantages from them. In Professor Pearl’s example, survival data from an experimental drug trial showed that the drug was equally effective on both male and female patients.

However, after also incorporating observational data, in which people freely choose whether to take the drug, the results differed significantly: the benefit-to-harm ratio turned out to be far higher for female patients.

This conceptual example showcases that observational data can add information with respect to individual responses, beyond that provided by a randomized experiment.
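For readers who want the mechanics, Tian and Pearl derived bounds on the probability of benefit, the chance that an individual would recover if treated and would not recover if untreated, which combine experimental quantities P(y | do(x)) with the observational joint distribution P(X, Y). The sketch below uses illustrative numbers of our own, not the figures from the talk:

```python
# Tian–Pearl bounds on the probability of benefit P(y_x, y'_{x'}).
# Experimental inputs p_yx = P(y | do(x)) and p_yx0 = P(y | do(x')) come
# from a randomized trial; p_xy, p_xy0, p_x0y, p_x0y0 are the observational
# joint probabilities P(X, Y) from free-choice data.

def benefit_bounds(p_yx, p_yx0, p_xy, p_xy0, p_x0y, p_x0y0):
    p_y = p_xy + p_x0y  # observational P(y)
    lower = max(0.0, p_yx - p_yx0, p_y - p_yx0, p_yx - p_y)
    upper = min(p_yx, 1 - p_yx0,
                p_xy + p_x0y0,
                p_yx - p_yx0 + p_xy0 + p_x0y)
    return lower, upper

# Illustrative (invented) numbers: trial shows P(y|do(x)) = 0.49,
# P(y|do(x')) = 0.21; observational joints sum to 1.
lo, hi = benefit_bounds(p_yx=0.49, p_yx0=0.21,
                        p_xy=0.27, p_xy0=0.03, p_x0y=0.14, p_x0y0=0.56)
print(f"P(benefit) ∈ [{lo:.2f}, {hi:.2f}]")  # [0.28, 0.45]
```

With experimental data alone, only the first two terms in each of the `max` and `min` are available; the observational terms can tighten the interval, which is exactly the sense in which free-choice data add information beyond the randomized trial.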

Conclusions

“The next revolution will be even more impactful once we realize that data science is the science of interpreting reality, not summarizing data,”
Professor Pearl concluded.

In summary, Professor Pearl emphasized the importance of looking beyond associations and encouraged the audience to apply counterfactual reasoning. His talk certainly gave us a lot to think about regarding how causality can and should be used in the real world.

For further reading and more useful case studies on how his theories can be applied to other industries, we’d highly recommend you check out The Book of Why written by Professor Judea Pearl and Dana Mackenzie.

More on bp’s Causal Inference Symposium

bp’s first Causal Inference Symposium took place in Q3 2023. In addition to Professor Judea Pearl’s keynote, the symposium featured sessions from a range of technical experts:

We heard from:

· Airbnb data scientist Dr Totte Harinen on justified assumptions

· Microsoft AI researcher Dr Robert Ness on how large language models can be leveraged for causal inference

· PyMC Labs’ Dr Benjamin T Vincent on how Bayesian networks can improve answers by adding uncertainty quantification

· Uber’s Dr Jeong-Yoon Lee on causal machine learning open-source tools

· TotalEnergies’ Dr Antoine Bertoncello on the realities of causal inference in the energy industry

The day closed with a lively panel discussion between Oxford University’s Dr Lars Kunze, Toyota Research Institute’s Dr Candice Hogan and Dr Harsha Tharkabhushanam, representing bp.

We have many more reflections from bp’s Causal Inference Symposium to come in a subsequent blog post, so do stay tuned.

To read more about how bp uses technologies to create change, please subscribe to our blog.

Dr Franziska Bell, senior vice president for digital technology at bp

Dr Franziska Bell is the senior vice president for digital technology at bp, leading software and platform engineering, data & analytics and design & change management for the bp group, and chairing bp’s centre of excellence for accelerating digital technology and cultural transformation. Fran is a recognized digital expert and an award-winning doctor of theoretical chemistry; she received her PhD from the University of California, Berkeley, before working as a post-doctoral scholar at the California Institute of Technology. Before joining bp in 2020, Fran’s passion for bridging the digital and physical worlds saw her lead Uber’s effort to put data science at every employee’s fingertips, delivering innovation and digital products at speed and low cost, and drive battery and fuel-cell materials discovery and artificial intelligence research at Toyota Research Institute.

Dr Natalia Konstantinova, staff data scientist at bp

Dr Natalia Konstantinova is a staff data scientist at bp and part of the digital technology centre of excellence. She is an AI enthusiast with almost 15 years’ experience applying natural language processing, artificial intelligence and machine learning to real-world problems. Her role is to develop standards and best practices that accelerate the adoption of machine-learning-enabled solutions, and to help various parts of the business shape data strategies that drive data & analytics adoption. Natalia received her PhD from the University of Wolverhampton and has worked in fields such as machine translation, ontologies, information extraction, dialogue systems, chatbots and general machine learning. She is a strong believer that modern technology can transform businesses and our everyday lives.
