As task-specific AIs are widely deployed to solve narrow, well-defined problems, the pinnacle of AI research remains the development of an artificial general intelligence (AGI). The "generality" of an AI lies not so much in possessing a vast array of knowledge and skills as in the ability to acquire, improve, and apply an unbounded range of them. An AGI would be able to learn and adapt to an unlimited number of situations, and produce flexible solutions and actions from a minimal amount of input data. The pursuit of this high-level intelligence would boost productivity and create value across a wide range of industries and society at large.
The media and Hollywood like to speculate on when we'll reach The Singularity, the point at which machine intelligence is on par with self-aware, flexible human intelligence. However, figuring out how to accurately model the human brain in a framework that machines can use is deemed near impossible by some in the AI community. Those who subscribe to Turing's Wager believe there is something deeply magical about how ions move across membranes in the brain, and that it will likely be a long time before we understand that motion well enough to model it. The Relativistic Brain is a recent book that introduces a new theory of how our brains work, and of why they will never be simulated on a digital computer. Given this consensus about the brain's complexity, one can see why we might be hundreds, if not thousands, of years away.
At Toyota AI Ventures, we’ve begun to see a number of startups building generalized AI systems. The cores of these systems stem from pure research topics, including explainability; generative networks (for both images and language); robustness (adversarial networks); self-supervised learning; graph networks; neural architecture search (AutoML); and intention prediction. One topic that has been gaining attention recently is causality, the idea of a machine’s ability to understand cause and effect.
As recently as 20 years ago, scientists were unable to write down a mathematical equation for the obvious fact that mud does not cause rain. Even today, a top scientist may struggle to write an equation that distinguishes "mud causes rain" from "rain causes mud". They can easily say mud is correlated with rain, and show that there is a high probability of seeing mud if you see rain. But expressing the simple causal concept, the kind of thing any child would know, is incredibly difficult.
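The asymmetry is easy to state as a small structural model, even though it is invisible in the joint distribution: seeing mud raises our belief in rain, but making mud does not make it rain. A minimal sketch in Python (the 30% chance of rain and the deterministic rain-to-mud mechanism are illustrative assumptions, not anything from the article):

```python
import random

random.seed(0)

def sample(do_mud=None):
    # Structural model (our assumption): rain causes mud, never the reverse.
    rain = random.random() < 0.3
    mud = rain if do_mud is None else do_mud  # an intervention overrides the mechanism
    return rain, mud

# Association (seeing): observing mud raises our belief in rain.
obs = [sample() for _ in range(100_000)]
p_rain_given_mud = sum(r for r, m in obs if m) / sum(1 for _, m in obs if m)

# Intervention (doing): forcing mud into existence leaves rain untouched.
ints = [sample(do_mud=True) for _ in range(100_000)]
p_rain_do_mud = sum(r for r, _ in ints) / len(ints)

print(f"P(rain | mud)     ≈ {p_rain_given_mud:.2f}")  # high: mud is evidence of rain
print(f"P(rain | do(mud)) ≈ {p_rain_do_mud:.2f}")     # stays at the base rate
```

Conditioning and intervening give different numbers here, which is exactly the distinction a correlation alone cannot express.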
Judea Pearl's evangelism in recent years has shone a light on how crucial causality is to the development of an AGI. Pearl's The Book of Why opens with "correlation does not imply causation" and "the sun will rise even if the rooster doesn't crow." An AGI would need a causal framework that can distinguish which events depend on one another. Pearl's "Ladder of Causation" has three rungs: seeing (association), doing (intervention), and imagining (counterfactuals). Most animals and machines sit on the first rung, where they learn from observation, e.g., "How would seeing X change my belief in Y?". Human babies are on the second rung, where they experiment to learn the effects of interventions, e.g., "What if I do X?" (though some animals, like Caspurr, our MD's cat, do approach the second rung). Counterfactual learners occupy the third rung, where they imagine worlds that do not exist and infer reasons for observed phenomena, e.g., "What if I had acted differently?".
Many examples show that observation alone cannot establish causation, and AIs that rely heavily on observational learning will still lack true understanding.
In the image above, the graph on the right shows raw data that would lead a machine to conclude that exercising more increases cholesterol levels. However, adding the age grouping, represented by the image on the left, brings a level of understanding that refutes what the raw data implies: within any particular age group, the more one exercises, the lower one's cholesterol. This reversal is a classic instance of Simpson's paradox, with age confounding the relationship between exercise and cholesterol.
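The reversal is easy to reproduce in a few lines. A hedged sketch (all the numbers here, the group sizes, baselines, and slopes, are made-up illustrative values, not the data behind the figure):

```python
import random

random.seed(1)

def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Assumed mechanism: older groups exercise more AND have higher baseline
# cholesterol, while within every group exercise lowers cholesterol.
exercise, cholesterol, groups = [], [], {}
for age in (20, 40, 60):
    xs, ys = [], []
    for _ in range(500):
        hours = random.gauss(age / 10, 1)                       # age -> exercise
        chol = 150 + 2 * age - 5 * hours + random.gauss(0, 3)   # age up, exercise down
        xs.append(hours)
        ys.append(chol)
    groups[age] = (xs, ys)
    exercise += xs
    cholesterol += ys

print("pooled correlation:", round(pearson(exercise, cholesterol), 2))  # positive
for age, (xs, ys) in groups.items():
    print(f"age {age} correlation:", round(pearson(xs, ys), 2))         # negative
```

Pooled, the data shows a positive association; stratified by the confounder, the sign flips in every group.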
In some circles, there is an almost religious school of thought that, given sufficiently strategic data mining, we can find the answers to all questions in the data itself. While data mining is a critical first step, causal questions such as "Is there a gene that causes lung cancer?" can never be answered from data alone. A model of the process that generates the data must be formulated in order to truly interpret the data, not merely observe or summarize it.
In recent years, more R&D dollars have been moving into the study of causality. One of the latest papers, by Leon Bottou and colleagues, is on Invariant Risk Minimization. This theory links causality to representation learning, a topic Yann LeCun recently discussed on Lex Fridman's podcast. The upshot is that true causes can be identified from data, i.e., disambiguated from spurious correlations, by finding the patterns that remain invariant across different environments. This is exactly what is needed to generalize ML models to out-of-domain samples, and it matters greatly for global-scale autonomous vehicles and robots operating in an open, constantly evolving world.
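The invariance intuition can be illustrated without the paper's full machinery. In the toy sketch below (all coefficients and noise levels are assumptions for illustration, not taken from the paper), the regression slope on a causal feature stays the same across environments, while the slope on a spurious feature shifts, and that difference in stability is the signal invariance-based methods exploit:

```python
import random

random.seed(2)

def slope(xs, ys):
    # Least-squares slope of y regressed on x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

# Two environments share a stable causal mechanism y = 2 * x_causal + noise;
# a spurious feature tracks y with an environment-dependent strength.
results = {}
for env_factor in (1.0, 3.0):
    x_causal = [random.gauss(0, 1) for _ in range(5000)]
    y = [2 * x + random.gauss(0, 0.1) for x in x_causal]
    x_spur = [env_factor * yi + random.gauss(0, 0.1) for yi in y]  # effect of y, not a cause
    results[env_factor] = (slope(x_causal, y), slope(x_spur, y))
    print(f"env {env_factor}: causal slope ≈ {results[env_factor][0]:.2f}, "
          f"spurious slope ≈ {results[env_factor][1]:.2f}")
```

Only the causal relationship survives the change of environment; a model that leans on the spurious feature would break out of domain.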
In the future, when my Rosie the Robot turns on the vacuum cleaner while I am sleeping and I yell, "You shouldn't have woken me up!", I want her to understand that the act of vacuuming was ill-timed. I don't want her to interpret it as an instruction never to vacuum, or never to wake me up again. She should follow a causal line of thought: vacuum cleaners make noise, humans sleep at night, noise wakes people up, and being woken up makes most people unhappy. Counterfactuals will be very important in our daily conversations with robots every time we say something that begins with "You shouldn't have". The strong AI pictured by Hollywood should be able to reflect on past actions, learn from past mistakes, and ultimately know when it should have acted differently.
While a general causality framework is the ultimate goal, domain-specific systems could provide solutions across a range of fields. In eCommerce, consider the A/B testing of a website's design. Today, experiments display different versions of a web page to determine which buttons and graphics cause users to click more often (consider Amazon.com's transformation over the past two decades). These tests reveal how sensitive humans are to factors like color and shape, which, while not logically relevant to a buying decision, can have drastic impacts on consumer behavior. If a machine could understand this causal relation, it could redesign a website to optimize revenue.
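An A/B test is itself a rung-two causal probe: because users are randomly assigned to variants, a difference in click rates can be read as the effect of the design change. A minimal sketch of the standard two-proportion z-test (the click counts below are hypothetical):

```python
from math import erf, sqrt

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for a difference in click-through rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)            # rate under H0: no difference
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # normal approximation
    return z, p_value

# Hypothetical experiment: variant B's button draws more clicks than variant A's.
z, p = two_proportion_z(clicks_a=120, n_a=2400, clicks_b=165, n_b=2400)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value here licenses the causal claim "the new button causes more clicks" precisely because the assignment was randomized, not merely observed.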
In healthcare, this could mean predicting when certain diseases will strike, and inferring which drug or combination of drugs gives a patient the best chance to get well. Similarly, in health insurance, firms could adjust rates and premiums by understanding what causes certain accidents and injuries, and by predicting them before they occur. In finance, a fraud detection system built around causality could recognize vulnerabilities and predict incidents with greater accuracy. Today's software can flag odd behavior, but a human still stares at the flagged data for hours, waiting for something fraudulent to happen. Building causality into fraud detection systems could save substantially on those costs.
Even if a general causality engine is a ways off, these task-specific AIs will be easier to build, and they still offer opportunities for significant innovation and real business use cases.
If you, or an entrepreneur you know, are working on causality or building a startup around it and need funding, we want to hear from you! Feel free to reach out.