Open-universe probabilistic models
Probabilistic models are widely used in artificial intelligence and machine learning. When using these models, we begin by stipulating a distribution on a set of random variables, the latter standing for the observable or latent atoms of our application domain. However, more often than not, we have to deal with the inescapable fact that our knowledge of the world is incomplete! For example, we cannot know how many individuals we will encounter in our lifetime, and we certainly cannot be expected to know the identity and traits of these individuals as and when we encounter them. In other words, there is uncertainty about how many things there are in this world, and about what these things are.
Yet, in AI, we need to be prepared to act and react in situations precisely of this sort. Think of a museum robot that meets and greets its visitors, most of whom may never have met the robot before their visit.
Part of what makes open-universe probabilistic models exciting is that they attempt to tackle this complication head-on in a principled manner, that is, in a manner that respects the axioms for updating priors. (This can be contrasted with, for example, beginning with the assumption that an atom p does not exist, and then changing that stance once we observe p.)
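To make the contrast concrete, here is a minimal sketch of what "respecting the axioms for updating priors" could look like when the number of objects is itself unknown. Everything in it is an assumption made for illustration: the Poisson prior, its rate, the truncation at `max_n` (used only so the sums are finite), and the function names. The point is only that the prior already gives weight to atoms we have not yet seen, so observing a new atom is ordinary conditioning rather than a change of stance.

```python
import math

def poisson_prior(rate, max_n):
    """Prior over the unknown number of objects N, truncated at max_n.

    The Poisson form and the truncation are illustrative assumptions.
    """
    weights = [math.exp(-rate) * rate**n / math.factorial(n)
               for n in range(max_n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def condition_on_distinct_seen(prior, seen):
    """Condition on the evidence that at least `seen` distinct objects exist.

    Values of N smaller than `seen` are ruled out; the rest are renormalised.
    """
    weights = [p if n >= seen else 0.0 for n, p in enumerate(prior)]
    total = sum(weights)
    return [w / total for w in weights]

prior = poisson_prior(rate=3.0, max_n=50)
posterior = condition_on_distinct_seen(prior, seen=2)

# N = 0 and N = 1 are now impossible, and the mass has shifted upwards.
print(posterior[0], posterior[1], posterior[2] > prior[2])
```

The ad hoc alternative would start from a model in which the new atom has probability zero, at which point no amount of conditioning could ever account for it.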
A question I have been interested in is this: under what conditions can we provide effective inference algorithms for such probabilistic models? The answer seems to depend on how we specify the model and what sort of queries we are ultimately interested in asking. I attempted to cover this landscape in a recent talk at the Amsterdam Machine Learning Lab:
A long-standing goal in AI has been to mimic the natural ability of human beings to infer things about sensory inputs and unforeseen data, usually involving a combination of logical and probabilistic reasoning. The last 10 years of research in statistical relational models have demonstrated how one can successfully borrow syntactic devices from first-order logic to define large graphical models over complex interacting random variables, classes, hierarchies, dependencies and constraints. Statistical relational models continue to be widely used for learning in large-scale knowledge bases, probabilistic configurations, natural language processing, question answering, probabilistic programming and automated planning.
While this progress has been significant, there are some fundamental limitations in the expressivity of these models. Statistical relational models make the finite domain assumption: given a clause such as “friends of smokers are smokers themselves”, the set of friends and those who smoke is assumed to be finite and known. This makes it difficult to talk about unknown atoms and values (e.g., “All of John’s friends are worth more than a million”), categorical assumptions (e.g., “every animal eats”) and identity uncertainty (e.g., “James’ partner wore a red shawl”). Current approaches often simply ignore this issue, or deal with it in ad hoc ways.
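The finite domain assumption can be illustrated with a small sketch of grounding, that is, instantiating a quantified clause with every combination of known individuals. The predicate names `Smokes` and `Friends`, the tuple encoding, and the helper `ground_smokers_rule` are hypothetical choices for this example and are not taken from any particular system.

```python
from itertools import product

def ground_smokers_rule(domain):
    """Ground 'Smokes(x) and Friends(x, y) implies Smokes(y)' over a finite domain.

    Each ground clause is a (premise, premise, conclusion) triple of atoms.
    The whole domain must be listed up front for this to work.
    """
    return [
        (("Smokes", x), ("Friends", x, y), ("Smokes", y))
        for x, y in product(domain, repeat=2)
    ]

domain = ["anna", "bob", "carl"]  # assumed to be finite and fully known
ground_clauses = ground_smokers_rule(domain)
print(len(ground_clauses))  # 9, one ground clause per ordered pair (x, y)
```

The procedure has nothing to say about an individual who is not in `domain`, and it cannot be run at all when the domain is infinite or its size is unknown, which is the limitation described above.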
In this work, we attempt to study this systematically. We begin with first-order probabilistic relational models, but now allow quantifiers to range over infinite sets. Although that makes reasoning undecidable in general, we show that, when limited to certain classes of statements, probabilistic reasoning becomes computable with attractive properties (e.g., it satisfies the additive and equivalence axioms of probability in a first-order setting).