A summary of our causal machine learning manuscript in NeurIPS 2019
I am excited to announce that our paper, “Integrating Markov processes with structural causal modeling enables counterfactual inference in complex systems,” by Kaushal Paneri, Olga Vitek, and myself, will be published in the proceedings of NeurIPS 2019.
I have been working with Olga for years on trying to tie seemingly orthogonal ideas into one narrative. Kaushal helped us bring it together with the implementation and simulations, which are described in the Slideshare at the bottom of this post. In this post, I provide a high-level summary of this work, who might be interested in giving it a read, and why.
What is this paper about?
This work illustrates how to do counterfactual reasoning and more robust intervention prediction with dynamic models. In our paper, we work with a special kind of dynamic model called a Markov process model, also known as a stochastic kinetic model (Wilkinson 2011), though the intuition is not specific to that particular case.
Dynamic models can simulate system behavior over time. A typical use case for simulation is predicting the outcomes of interventions, meaning perturbations to that system.
However, if the dynamic model is badly misspecified, these intervention predictions can differ dramatically from what would happen if the intervention were applied in real-world settings, such as in a controlled experiment. We illustrate how to make intervention predictions more robust to model misspecification errors.
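To make "predicting the outcome of an intervention by simulation" concrete, here is a minimal sketch in Python. It is not one of the models from the paper: it assumes a hypothetical birth-death process (a molecule produced at a constant rate and degraded at a per-molecule rate), simulated with the standard Gillespie algorithm, and it treats the intervention as halving the production rate. All rate values and function names are made up for illustration.

```python
import random

def simulate_birth_death(birth_rate, death_rate, x0, t_end, rng):
    """Gillespie simulation of a birth-death process; returns the count at t_end."""
    t, x = 0.0, x0
    while True:
        # Propensities: constant production, per-molecule degradation.
        a_birth = birth_rate
        a_death = death_rate * x
        total = a_birth + a_death
        if total == 0.0:
            return x  # nothing can happen any more
        # Time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return x
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * total < a_birth:
            x += 1
        else:
            x -= 1

def mean_count(birth_rate, death_rate, n_runs=500, t_end=20.0, seed=0):
    """Average molecule count at t_end over independent simulated trajectories."""
    rng = random.Random(seed)
    runs = [simulate_birth_death(birth_rate, death_rate, 0, t_end, rng)
            for _ in range(n_runs)]
    return sum(runs) / n_runs

# Hypothetical rates: the "intervention" halves the production rate.
baseline = mean_count(birth_rate=10.0, death_rate=0.5)
intervened = mean_count(birth_rate=5.0, death_rate=0.5)
print(f"baseline mean count:   {baseline:.2f}")
print(f"intervened mean count: {intervened:.2f}")
```

For this toy process the long-run mean is the production rate divided by the degradation rate, so the two averages should land near 20 and 10. If the assumed degradation rate were wrong, both predictions would be off by the same factor, which is the kind of misspecification error the paragraph above refers to.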
We start by showing how to convert a dynamic model to a structural causal model (Pearl (2000)). The structural causal model (SCM) framework, combined with a suitable probability model, allows you to reason counterfactually on past data. For example, you can ask, “given that I applied this intervention A and got this outcome, what outcome would I have gotten with intervention B?”
We showed that answers to this question are more robust to model misspecification than answers to simple intervention prediction questions like “what would happen if I applied intervention B?” We did this with an algorithm for simulating ground-truth counterfactual scenarios from a dynamic model. We demonstrated that our modeling procedure recovers the ground truth when the model is correctly specified, and gets closer to it than pure intervention prediction does when the model is incorrectly specified.
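The difference between the two kinds of question can be sketched with the textbook three-step recipe for counterfactuals in an SCM (abduction, action, prediction). The example below is a toy, not the construction from the paper: it assumes a single outcome, an exponentially distributed waiting time, written as a deterministic function of a rate and a uniform noise term. The rates for interventions A and B are hypothetical. Because this toy assignment is invertible, the noise can be recovered exactly from the observation; in less convenient models that step requires probabilistic inference.

```python
import math
import random

def waiting_time(rate, u):
    """Structural assignment: exponential waiting time from uniform noise u in (0, 1]."""
    return -math.log(u) / rate

def abduct_noise(rate, observed_time):
    """Abduction: invert the assignment to find the noise behind an observation."""
    return math.exp(-rate * observed_time)

# Hypothetical rates under interventions A and B.
rate_a, rate_b = 2.0, 0.5

# What actually happened: intervention A was applied and a time was observed.
rng = random.Random(1)
u_true = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
observed = waiting_time(rate_a, u_true)

# Counterfactual query: same noise, different intervention.
u_hat = abduct_noise(rate_a, observed)          # abduction
counterfactual = waiting_time(rate_b, u_hat)    # action + prediction

# Plain intervention prediction ignores the observation entirely.
interventional_mean = 1.0 / rate_b

print(f"observed under A:        {observed:.4f}")
print(f"counterfactual under B:  {counterfactual:.4f}")
print(f"average outcome under B: {interventional_mean:.4f}")
```

The counterfactual answer is tied to the specific outcome that was observed (here it is the observed time scaled by `rate_a / rate_b`), whereas the intervention prediction is the same no matter what was observed.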
Who might find this interesting or useful?
Researchers working on systems or synthetic biology. Our work focuses on systems biology as a motivating example, where dynamic modeling is a common choice for modeling cell-level behavior. Sites like BioModels.org provide dynamic models as files that can be downloaded then loaded and used for simulation in software like Matlab. In this domain, an intervention might be a drug that inhibits the activity of a specific cellular protein.
Simulation-based modelers in econometrics, epidemiology, and other domains. The Markov process models we focus on are a particular case of dynamic models. The meaning of the term “dynamic modeling” overlaps with terms such as mechanistic modeling, mathematical modeling, computational modeling, agent-based modeling, and other types of simulation-based modeling approaches. Practitioners who work with these types of simulation-based modeling would likely find some useful insights in our work.
Reinforcement learning researchers interested in counterfactual reasoning. Some recent work has examined the use of SCMs for counterfactual policy evaluation in reinforcement learning settings (Buesing et al. (2018), Oberst and Sontag (2019)). One of the difficulties with this approach is that the functional form of SCMs themselves cannot be learned from data. In general, for a given dataset, there is a set of SCMs that differ in mathematical specification and yet would assign the data the same likelihood. Due to these differences in how they are specified, these SCMs could provide different answers to a counterfactual query. How do you know which SCM to select from this equivalence class?
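The non-identifiability described above can be seen in a small, standard toy example. The sketch below is not taken from the paper; it assumes a binary treatment X and a binary noise term U that are independent fair coins, and compares two hypothetical SCMs for the outcome Y. Both produce exactly the same distribution over the data, yet they give opposite answers to the same counterfactual query.

```python
from itertools import product

def scm_copy(x, u):
    """SCM 1: the outcome ignores the treatment and copies the noise."""
    return u

def scm_xor(x, u):
    """SCM 2: the treatment flips the noise."""
    return x ^ u

def observational(scm):
    """Joint distribution P(x, y) when X and U are independent fair coins."""
    dist = {}
    for x, u in product((0, 1), repeat=2):
        key = (x, scm(x, u))
        dist[key] = dist.get(key, 0.0) + 0.25
    return dist

def counterfactual(scm, x_obs, y_obs, x_new):
    """P(Y would be 1 under x_new | observed X = x_obs, Y = y_obs)."""
    # Abduction: keep the noise values consistent with the observation
    # (they are equally likely a priori, so a plain average suffices).
    consistent = [u for u in (0, 1) if scm(x_obs, u) == y_obs]
    # Action + prediction: rerun the model with the new treatment.
    return sum(scm(x_new, u) for u in consistent) / len(consistent)

# Same likelihood for any dataset of (x, y) pairs.
print(observational(scm_copy) == observational(scm_xor))

# "I applied X=0 and saw Y=0; what would Y have been under X=1?"
print(counterfactual(scm_copy, x_obs=0, y_obs=0, x_new=1))
print(counterfactual(scm_xor, x_obs=0, y_obs=0, x_new=1))
```

No amount of data on X and Y can separate these two models, so the choice between them has to come from somewhere other than the data.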
This work addresses this identifiability issue by basing the math of the SCM on prior knowledge. Dynamic models are composed of discrete components that react with one another continuously in time according to a set of rules. The mathematical form of the SCM is derived directly from these rules.
In many reinforcement learning settings, such as board games, the rules are clear. However, our work illustrates what to do when the rules are not so clear. In our examples, the rules and the resulting SCM come from a model of enzyme kinetics. Enzyme kinetic models by no means capture the full complexity of ground truth biophysics. However, a good enzyme kinetic model has the smallest set of biochemical rules needed to capture the behaviors that an enzyme biologist considers essential.
This suggests a useful principle in causal modeling: to build a causal model that can help you reason counterfactually about your system, start by enumerating the micro-level rules needed to simulate the macro-level behavior that matters to you.
Researchers trying to link deep learning and causal reasoning. We implemented the SCMs in our examples using Pyro (Bingham et al. (2019)), a PyTorch-based probabilistic deep learning framework from Uber AI. We use Pyro’s algorithms for Bayesian inference to do counterfactual inference. This work presents an excellent prototype for causal modeling with deep learning frameworks.
- Bingham, E., Chen, J. P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karaletsos, T., … & Goodman, N. D. (2019). Pyro: Deep universal probabilistic programming. The Journal of Machine Learning Research, 20(1), 973–978.
- Buesing, L., Weber, T., Zwols, Y., Racaniere, S., Guez, A., Lespiau, J. B., & Heess, N. (2018). Woulda, coulda, shoulda: Counterfactually-guided policy search. arXiv preprint arXiv:1811.06272.
- Oberst, M., & Sontag, D. (2019). Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models. arXiv preprint arXiv:1905.05824.
- Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press.
- Wilkinson, D. J. (2011). Stochastic modelling for systems biology. CRC Press.