Toward Grand Unified AGI

Exploring the integration of multiple AI algorithms within a Unified Rule Engine as a route to robust Meta-Learning.

Ben Goertzel
SingularityNET
12 min read · Aug 24, 2018


From Here to AGI via A Few Concrete Steps

To get from where we are now to AI systems with general intelligence at the human level, and beyond, advances on multiple levels will be needed; at the very least:

  1. More effective and fluid interconnection of different AIs into processing networks (which is the core thing the SingularityNET technical design solves).
  2. Incentivising of more developers to work on AGI rather than purely on AI that serves the business models of large tech companies and military/intel agencies (again something SingularityNET addresses from an economic/organizational perspective).
  3. Rich interoperation of sensory, motoric and cognitive subsystems, so that AIs can learn better from the natural and human world (SingularityNET is working on this via its collaboration with Hanson Robotics).
  4. Effective meta-learning — general-purpose learning algorithms that can learn how to learn better, learn how to best carry out learning in particular contexts, learn how to learn and also learn how to learn how to learn.

In this blog post, I am going to unfold some reasonably technical ideas pertinent most directly to the fourth point in the list: How to make meta-learning work in reality, in the context of a complex multi-algorithm cognitive architecture carrying out a variety of complicated tasks.

Dr. Nil Geisweiller has recently written a research blog post describing his current work on “probabilistic inference meta-learning.” There, he discusses using OpenCog’s Probabilistic Logic Networks (PLN) framework as the base-level algorithm for meta-learning: applying pattern mining, and then PLN itself, to large sets of PLN inference examples in order to learn what sorts of inferences work better in what contexts. This gets at the crux of the meta-learning problem in an OpenCog context; it is about using PLN to help PLN learn how to reason better.

This blog post is complementary to Dr. Nil’s: here I am going to describe some work currently underway to, in effect, fuse various learning/reasoning algorithms that now operate separately within OpenCog, so that they appear as aspects of a single unified learning/reasoning algorithm.

This sort of unification provides greater elegance than a situation where there are multiple markedly distinct learning/reasoning algorithms. However, that is not what I want to highlight in this blog post.

My main point is that a more unified learning/reasoning algorithm will be more susceptible to powerful meta-learning because it will make it easier to learn patterns in “how to learn and reason” that span multiple aspects of learning/reasoning.

For instance, if procedural learning and declarative reasoning are implemented as entirely different algorithms, it is harder to learn patterns about “how to learn” that span both of these. However, if they are implemented as different aspects of the same unified meta-algorithm, it is easier to learn such patterns (though they are still not necessarily easy to learn).

This line of thinking is the inspiration for a number of research and implementation projects currently underway within OpenCog and SingularityNET, some of which I am going to describe here. So rather than presenting new software work, this post elaborates on the unifying thread and principle behind many different software initiatives already in progress.

Forward and Backward Growth/Chaining Processes

The core conceptual idea of the “unified learning/reasoning meta-algorithm” I am going to talk about in this post was described in a not-very-formal essay I wrote in 2006.

The basic idea is just this: An awful lot of cognitive processes can be represented either as iterative “forward-going” growth processes (in which a set of growth operations is repeatedly and recursively applied) or as “reverse engineering” growth processes, in which the starting point is a target and the goal is to figure out how to build that target by repeatedly/recursively applying a set of growth operations. Philosophically, one could say that cognitive processes mostly involve growth running either forward or backward in time. There are also various more or less subtle combinations of forward- and backward-going growth processes, which are seen in various types of cognition and algorithmic processing.

This philosophical/cognitive conception became expressed in the OpenCog architecture a few years ago via the creation of the Unified Rule Engine (URE), an abstract rule engine that enables the construction of very flexible logical rule systems representing various types of forward or backward growth processes. In this approach, forward and backward chaining in logic are presented as special cases of more general forward and backward “rewrite rule application” processes. The rules involved are rules that enact transformations on the Atomspace, OpenCog’s weighted, labeled hypergraph knowledge store. Since the URE was created (mostly by Nil Geisweiller), an increasing number of AI processes operating within OpenCog have been implemented or re-implemented in terms of the URE.
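
To make the forward/backward distinction concrete, here is a deliberately tiny sketch in Python. This is not URE or OpenCog code; the rule format and the toy knowledge base are invented purely for illustration. Forward chaining grows the fact set by repeatedly firing rules whose premises hold; backward chaining starts from a target and asks which rule applications could have produced it.

```python
# A minimal illustration of forward vs. backward rule application.
# This is NOT OpenCog/URE code; the rule format is invented for the example.

RULES = [
    ({"croaks", "eats_flies"}, "frog"),   # if it croaks and eats flies, it is a frog
    ({"frog"}, "green"),                  # frogs are green
    ({"chirps", "sings"}, "canary"),
    ({"canary"}, "yellow"),
]

def forward_chain(facts):
    """Grow the fact set by repeatedly applying every rule whose premises hold."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts, depth=0):
    """Work backward from a target, asking which rules could have produced it."""
    if goal in facts:
        return True
    if depth > 10:   # crude cutoff to avoid runaway recursion
        return False
    for premises, conclusion in RULES:
        if conclusion == goal and all(
            backward_chain(p, facts, depth + 1) for p in premises
        ):
            return True
    return False

print(forward_chain({"croaks", "eats_flies"}))            # includes 'frog' and 'green'
print(backward_chain("green", {"croaks", "eats_flies"}))  # True
```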

Those who have been around the AI field a long time may associate the notion of a “rule engine” with the Good Old Fashioned AI (GOFAI) systems of the last century, in which an attempt was made to achieve intelligence via relatively simple inference processes acting on hand-coded knowledge bases. These systems had impressive and important successes in some limited domains, but it is now generally acknowledged that this was a research dead end.

The mathematical notion of a “rule system,” however, is much more general than what was done in these GOFAI systems. For instance, a neural net architecture can easily be implemented as a rule system, where the rules do things like update neural weights, activations, or connections. Similarly, an evolutionary learning system can easily be implemented as a rule system, where the rules govern mutation, crossover, selection, and so forth.
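
As a toy illustration of that generality (again invented for this post, not drawn from any actual OpenCog module), a perceptron-style weight update and an evolutionary mutation can both be written as rewrite rules of the same shape, i.e., functions from a state to a transformed state, and driven by the same generic forward-application loop:

```python
import random

# Two "rules" with the same signature: they take a state and return a new state.
# First, a gradient-style weight update for a one-weight linear "network"...
def weight_update_rule(state, lr=0.1):
    w, x, target = state["w"], state["x"], state["target"]
    error = target - w * x
    return {**state, "w": w + lr * error * x}

# ...and second, an evolutionary mutation, viewed as just another rewrite rule.
def mutation_rule(state, sigma=0.05):
    return {**state, "w": state["w"] + random.gauss(0.0, sigma)}

def apply_rules(state, rules, steps):
    """A generic forward 'growth' loop: repeatedly apply the given rules."""
    for _ in range(steps):
        for rule in rules:
            state = rule(state)
    return state

state = {"w": 0.0, "x": 2.0, "target": 1.0}
print(apply_rules(state, [weight_update_rule], steps=20)["w"])  # approaches 0.5
```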

The URE in OpenCog exploits the breadth and flexibility of the mathematics of iterative forward and backward rule application in ways that GOFAI systems did not.

The Breadth of Rule Engine Applications in OpenCog Today

OpenCog’s PLN (Probabilistic Logic Networks) uncertain logic engine matches extremely naturally with the URE (as logic is a form of rule system). The frameworks used to map syntactic sentence parses into sets of semantic relationships (RelEx2Logic and LinkParse2Lojban) also fit very naturally into the URE framework.

The OpenCog Pattern Miner, which in an earlier version was written as reasonably complex C++ code acting directly on the Atomspace, has recently been re-implemented using the URE. This is a somewhat subtle use of the URE: As the pattern mining process proceeds, a population of candidate patterns is “grown,” step by step. The rules involved are such that they take candidate patterns and extend them into larger candidate patterns. This is repeated over and over, and any pattern that seems not significant enough (in terms of frequency, surprisingness or whatever other criterion is at hand) is removed from the population. The rule-based growth of patterns here is conceptually similar to the rule-based growth of plants one sees when simulating plant growth using a rewrite rule system such as a Lindenmayer system.
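
A rough caricature of that grow-and-prune dynamic is a toy frequent-substring miner. The real Pattern Miner grows subhypergraph patterns over the Atomspace rather than strings and uses richer quality measures, but the rule-driven growth loop below has the same overall shape (the corpus and threshold are invented for illustration):

```python
# Toy grow-and-prune pattern mining over strings, analogous in spirit
# (but not in representation) to the URE-based Pattern Miner.

CORPUS = ["abcabcabd", "xabcabz", "ababcabc"]
MIN_COUNT = 3   # "significance" threshold: minimum number of occurrences

def count(pattern):
    return sum(text.count(pattern) for text in CORPUS)

ALPHABET = sorted(set("".join(CORPUS)))

# Start from single-character candidates that already pass the threshold.
population = [c for c in ALPHABET if count(c) >= MIN_COUNT]
results = list(population)

while population:
    grown = []
    for pattern in population:
        for c in ALPHABET:
            candidate = pattern + c             # "growth rule": extend the pattern
            if count(candidate) >= MIN_COUNT:   # prune insignificant extensions
                grown.append(candidate)
    results.extend(grown)
    population = grown

print(results)   # frequent patterns such as 'abc', 'bca', 'cab', ...
```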

One advantage the new URE-based Pattern Miner possesses over the previous version is the capability to combine pattern mining with lightweight logical inference in various ways. Instead of just looking for patterns that exactly fulfill a specific quality function (e.g., frequency or surprisingness), one can look for patterns that fulfill the quality function after certain simple logical transformations are carried out. The URE can freely interweave the growth rules used in pattern mining with logical transformation rules such as those used in PLN (or others).
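
Continuing the toy string-mining example above (and, again, purely illustratively), “lightweight logical inference during mining” can be as simple as evaluating the quality function modulo a rewrite rule, so that forms the transformation treats as equivalent get pooled together:

```python
# Same toy corpus as in the previous sketch; the "logical transformation" here
# is a hypothetical equivalence between two symbols, applied before counting.
CORPUS = ["abcabcabd", "xabcabz", "ababcabc"]
EQUIVALENT = {"z": "d"}   # invented equivalence, standing in for a PLN-style rewrite

def normalize(text):
    return "".join(EQUIVALENT.get(c, c) for c in text)

def count_up_to_equivalence(pattern):
    """Count occurrences of a pattern after the transformation is applied."""
    pattern = normalize(pattern)
    return sum(normalize(text).count(pattern) for text in CORPUS)

print(count_up_to_equivalence("abd"))   # counts 'abd' and 'abz' occurrences together
```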

The rules governing motivated action, rules of the conceptual form “(Context AND Procedure) IMPLIES Goal,” within the OpenPsi module, have been implemented this way as well, bringing the URE into the domain of real-time interactivity.
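
A much-simplified, hypothetical rendering of such rules (not the actual OpenPsi schema; the rule fields, state predicates, and weights below are invented) treats action selection as one more case of rule matching: pick a rule whose context holds in the current state and whose goal matches an active demand:

```python
# Toy action selection in the spirit of "(Context AND Procedure) IMPLIES Goal" rules.

PSI_RULES = [
    {"context": {"person_nearby"}, "procedure": "say_hello",
     "goal": "social_interaction", "weight": 0.8},
    {"context": {"person_nearby", "asked_question"}, "procedure": "answer",
     "goal": "social_interaction", "weight": 0.9},
    {"context": {"battery_low"}, "procedure": "go_charge",
     "goal": "self_maintenance", "weight": 1.0},
]

def select_action(state, active_goal):
    """Pick the highest-weighted rule whose context holds and which serves the goal."""
    candidates = [r for r in PSI_RULES
                  if r["goal"] == active_goal and r["context"] <= state]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["weight"])["procedure"]

print(select_action({"person_nearby", "asked_question"}, "social_interaction"))  # answer
```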

The Psi-dynamics module carries out real-time updating of emotion- and motivation-related variables associated with OpenCog-based control of embodied agents (such as the Sophia robot from Hanson Robotics): variables such as emotional valence, arousal, and urgency. This represents a use of the URE for implementing a (frequently updated) time-discrete, continuous-state nonlinear dynamical system.
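
Abstractly (and as a caricature; the particular equations below are invented for illustration, not the real Psi-dynamics update), such a system is just a rule mapping the current affective-state vector to the next one, applied once per time step:

```python
# A caricature of a time-discrete, continuous-state update for affective
# variables. The particular equations are invented for illustration only.

def step(state, stimulus, decay=0.9):
    valence = decay * state["valence"] + 0.1 * stimulus["pleasantness"]
    arousal = (decay * state["arousal"]
               + 0.1 * abs(stimulus["pleasantness"]) + 0.1 * stimulus["novelty"])
    # a simple nonlinearity keeping the variables in [-1, 1] and [0, 1]
    clamp = lambda x, lo, hi: max(lo, min(hi, x))
    return {"valence": clamp(valence, -1.0, 1.0), "arousal": clamp(arousal, 0.0, 1.0)}

state = {"valence": 0.0, "arousal": 0.2}
for _ in range(5):
    state = step(state, {"pleasantness": 0.8, "novelty": 0.5})
print(state)
```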

With a similar approach, the URE could be used to implement other sorts of nonlinear dynamical systems, e.g., simulations of biological systems. This is not being done currently but is part of the future plans for the utilization of OpenCog within the Mozi bio-AI project (see some relevant OpenCog code here). The underlying vision is that to solve hard biomedical problems like human aging, it will be necessary to tightly integrate data-driven machine learning, natural language information extraction from research papers, probabilistic reasoning that integrates the results of machine learning data analysis and natural language processing with knowledge from bio-ontologies — AND simulation of biological systems at various levels, to various degrees of precision.

If the simulation modeling, the reasoning, the ML and the NLP are all done using the same URE framework, then interoperating them becomes much easier — both implementationally and conceptually — than if they were all implemented in different ways.

Another research thread currently in play, at the St. Petersburg SingularityNET office led by Dr. Alexey Potapov, regards the integration of deep neural networks (doing, for instance, vision processing, or analysis of robot movement data) with OpenCog’s symbolic reasoning (such as the Pattern Matcher, Pattern Miner or PLN). One way to implement this would be to use the URE to apply rules that enact updates on neural networks, and this is currently being explored.

AtomSpace-MOSES

A new URE application that I am now somewhat involved with is the porting of the MOSES automated program learning engine into the Atomspace. Currently, MOSES (Meta-Optimizing Semantic Evolutionary Search) is considered part of OpenCog, but unlike the other pieces of OpenCog, it does not use the Atomspace as a representation. Rather, it uses a separate, tree-based representation of programs, and then applies a special sort of evolutionary learning to these programs. Once they are learned, the programs can then be imported into the Atomspace for follow-on analysis, e.g., using the Pattern Matcher or Pattern Miner or PLN or other methods.

The Atomspace-MOSES project (AS-MOSES), currently being carried out for SingularityNET by a team at iCog Labs in Addis Ababa under the supervision of Dr. Nil Geisweiller, aims at bringing MOSES into the Atomspace. In the AS-MOSES approach, the evolutionary program-learning algorithm at the core of MOSES remains basically the same, but programs are represented as Atom graphs, just like the other sorts of procedures that exist in the Atomspace.

MOSES involves a stage in which programs are normalized into a standardized, hierarchical “Elegant Normal Form” — this is very natural for the URE to handle. Porting MOSES program trees into the Atomspace allows us to eliminate the separate Reduct program-normalization library currently used and replace it with application of the URE.
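
As a trivial stand-in for that kind of normalization (Elegant Normal Form itself is considerably more involved, and the rules below are simplifications invented for this sketch), here is a toy reducer that repeatedly applies local rewrite rules to a Boolean program tree until it reaches a fixed point:

```python
# Toy program normalization by repeated local rewriting. Programs are nested
# tuples like ("and", x, y); the rules are simplifications chosen for illustration.

def rewrite(expr):
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [rewrite(a) for a in args]
    if op == "not" and isinstance(args[0], tuple) and args[0][0] == "not":
        return args[0][1]                          # not(not(x)) -> x
    if op == "and" and "false" in args:
        return "false"                             # and(..., false, ...) -> false
    if op == "and":
        args = [a for a in args if a != "true"]    # drop redundant 'true' conjuncts
        if len(args) == 1:
            return args[0]
    if op == "or" and "true" in args:
        return "true"                              # or(..., true, ...) -> true
    return (op, *args)

def normalize(expr):
    previous = None
    while expr != previous:                        # apply rules to a fixed point
        previous, expr = expr, rewrite(expr)
    return expr

print(normalize(("and", ("not", ("not", "x")), "true")))   # 'x'
```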

The process by which MOSES creates a new program, to fulfill whatever fitness function it is trying to optimize, is a growth process similar to what happens inside the Pattern Miner. A small program tree is created, and then is extended incrementally by adding new branches and nodes to it — at each step testing the impact the extension has on the degree to which the program satisfies the fitness function. This process of program tree extension can be implemented via a special set of program-tree growth rules.
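
Here is a minimal, hill-climbing caricature of that grow-and-test loop. It is not the actual MOSES algorithm (which evolves a population and uses crossover, normalization, probabilistic modeling, and much more); the expression format, growth rule, and fitness target are invented for illustration:

```python
# Grow a small arithmetic-expression tree toward a target behavior, keeping
# an extension only if it improves fitness. Target here: f(x) = x*x + 1.
DATA = [(x, x * x + 1) for x in range(-3, 4)]

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "+" else a * b

def fitness(expr):
    return -sum((evaluate(expr, x) - y) ** 2 for x, y in DATA)  # higher is better

def extensions(expr):
    """Growth rule: wrap the current expression in a new node with a fresh leaf."""
    return [(op, expr, leaf) for op in ("+", "*") for leaf in ("x", 1)]

best = "x"
for _ in range(10):
    candidate = max(extensions(best), key=fitness)   # test each one-step extension
    if fitness(candidate) <= fitness(best):          # stop when nothing helps
        break
    best = candidate

print(best, fitness(best))   # ('+', ('*', 'x', 'x'), 1) 0
```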

The original vision of MOSES involved a large role for probabilistic modeling of a population of program trees, during the program-evolution process. In the current MOSES implementation, probabilistic population modeling plays a smaller role, because the types of simplistic probabilistic program tree modeling that were straightforward to implement in the standalone MOSES codebase proved to be of limited effectiveness. Bringing MOSES into the Atomspace allows easy experimentation with PLN logic and pattern mining as methods for probabilistically modeling sets of programs being evolved toward fulfillment of a fitness function.

It is expected this will lead to new kinds of fusion between evolutionary learning and probabilistic logic. This should be valuable for many different applications, including for instance scientific hypothesis generation (where the evolutionary learning aspect is good at generating creative new hypotheses, and the logical reasoning aspect is good at filtering out hypotheses and guiding hypothesis formation based on scientific background knowledge). Artistic creativity may be approached similarly, with evolutionary learning coming up with wacky new ideas and probabilistic inference providing guidance via estimating the degree of appreciation of each idea by the audience in question. Physical design may also be approached in the same way, where MOSES’s evolutionary design creativity can be combined with a logic engine’s ability to apply the laws of physics (or chemistry, in a nano-design context).

The Promise of Unified Implementation for Meta-Learning

Implementing various AI algorithms using a common mathematical and software framework is not just an elegant thing to do; it represents a new way of thinking about the OpenCog AGI approach.

One of the significant distinctions between OpenCog and the Webmind AI Engine that my colleagues and I built in the late 1990s is this: Webmind consisted of an entirely heterogeneous set of AI algorithms acting on a common graph knowledge representation, whereas OpenCog consists of a set of AI algorithms specifically configured to work together effectively, in the sense of understanding each other’s intermediate representations and being able to help each other avoid combinatorial explosions (this has been called “cognitive synergy”).

The implementation of OpenCog’s various AI algorithms in terms of the URE represents a next step: A set of AI algorithms, specifically configured to work together cognitive-synergetically on a common weighted-labeled-hypergraph knowledge representation, and all operating as forward and/or backward growth processes using the same abstract rule application framework.

Now consider what this means for meta-learning. In URE meta-learning, one searches for patterns of (sequential or parallel) rule application, which are correlated with effectiveness according to whatever the relevant goals are. In logic, one is searching for sequences of logical rules that tend to lead to correct, high-confidence conclusions in particular contexts. In semantic interpretation, one is searching for sets of syntactico-semantic transformation rules that are generally effective in producing the correct interpretation of a sentence. In evolutionary program learning, one is searching for sequences of program-tree mutations that are surprisingly likely to yield an improved program (in a particular context or for a particular sort of program).
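
In each of these cases, the raw material for meta-learning is a corpus of rule-application traces tagged with outcomes. A toy version of the search (with an invented trace format; the real thing would apply the Pattern Miner and PLN to Atomspace representations of inference trails) just looks for pairs of consecutively applied rules that are over-represented in successful traces:

```python
from collections import Counter

# Hypothetical inference traces: each is a sequence of rule names applied,
# plus a flag saying whether the overall attempt succeeded.
TRACES = [
    (["deduction", "modus_ponens", "abduction"], True),
    (["deduction", "modus_ponens"], True),
    (["abduction", "abduction", "deduction"], False),
    (["modus_ponens", "deduction", "modus_ponens"], True),
    (["abduction", "induction"], False),
]

def bigram_counts(traces):
    """Count consecutive pairs of rule applications across a set of traces."""
    counts = Counter()
    for rules, _ in traces:
        counts.update(zip(rules, rules[1:]))
    return counts

good = bigram_counts([t for t in TRACES if t[1]])
bad = bigram_counts([t for t in TRACES if not t[1]])

# Score each rule pair by how much more often it appears in successful traces.
scores = {pair: good[pair] - bad.get(pair, 0) for pair in good}
print(max(scores, key=scores.get))   # ('deduction', 'modus_ponens')
```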

Cognitive synergy, in the simplest sense, involves solving cognitive problems by doing a few steps of one AI algorithm to get an intermediate result, then (sequentially or in parallel) a few steps of another AI algorithm acting on that intermediate result and producing another intermediate result, then a few more steps of yet another AI algorithm acting on that new intermediate result, and so on.

Now, suppose all of these AI algorithms are operating using the URE. It then becomes possible to look for sequences of rule-applications that span multiple AI algorithms. One can look for patterns of the form “Apply these two types of mutation operations on a program tree, then apply these sorts of inference rules to combine the program tree with relatively crisp knowledge from an ontology, then apply pattern mining with these sorts of logical transformation rules to look for patterns in the results.”
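
Structurally, such a cross-algorithm pattern is no different from a within-algorithm one: the trace entries simply carry a tag saying which algorithm contributed each rule application, and the same kind of mining applies. Again, the format below is invented purely for illustration:

```python
# A hypothetical cross-algorithm trace: each step records the algorithm that
# supplied the rule as well as the rule itself. Mining sequences of these
# (algorithm, rule) pairs works exactly like mining single-algorithm traces.
CROSS_TRACE = [
    ("MOSES", "point_mutation"),
    ("MOSES", "subtree_insertion"),
    ("PLN", "deduction"),
    ("PLN", "modus_ponens"),
    ("PatternMiner", "extend_pattern"),
]

# e.g., the consecutive pairs of this trace, which a miner could test for
# correlation with downstream success, just as in the previous sketch
print(list(zip(CROSS_TRACE, CROSS_TRACE[1:])))
```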

Of course, one could write code to look for patterns spanning the internal operations of multiple AI algorithms anyway, even if the various AI algorithms were not all implemented using the same core mechanisms and dynamics. However, conceptualizing, writing, and experimenting with this sort of code is much easier given a unifying framework that spans all the AI algorithms in question conceptually, mathematically, and at the implementation level. This is one thing the URE gives us.

Along the way, there is also the possibility of finding patterns of effective rule application that recur across multiple seemingly different cognitive algorithms, or of doing transfer learning: applying patterns found by studying one AI algorithm’s rule applications to ease the discovery of related patterns in another AI algorithm’s rule applications.

While somewhat complicated to explain in natural language, once absorbed this approach begins to seem almost obvious. Thus, it is striking to realize that no project on the planet besides OpenCog and SingularityNET is intensively exploring this direction. Most AI research involves applying one algorithm in a separate and monolithic way. Most work on multi-algorithm cognitive architectures avoids the use of multiple powerful, scalable learning algorithms — because making such algorithms work together is a pain, placing special requirements on a cognitive architecture.

SingularityNET networks together different AI algorithms in a loosely connected way — generally speaking, two distinct AI agents in SingularityNET will exchange data, requests, and results, but they will not expose their intermediate states to each other. SingularityNET could support agents that openly and intensively share intermediate states with each other, but this won’t be the most common case — and this is fine, because different AI algorithms written by different people for different purposes can’t be expected to understand each other’s intermediate states, let alone draw useful conclusions from them.

On the other hand, if one DOES have algorithms that can comprehend each others’ intermediate states — and whose intermediate states are similarly structured enough to support cross-algorithm intermediate-state pattern mining and reasoning — then one has the potential for a higher degree of cognitive synergy. This is what we are aiming at within OpenCog, via finding elegant and simple ways to implement our various cognitive algorithms using the URE.

How Can You Get Involved?

Please visit our Community Forum if you would like to discuss this post with other members of the SingularityNET community. Over the coming weeks, we hope to not only provide you with more insider access to SingularityNET’s groundbreaking AI research but also to share with you the specifics of our development.

For any additional information, please refer to our roadmaps and subscribe to our newsletter to stay informed regarding all of our developments.
