AI Cannot Ignore Symbolic Logic, and Here’s Why

Walid Saba, PhD
Published in ONTOLOGIK · Dec 29, 2021


edited 1/2/2022

Generalization and High-Level Representations

Yoshua Bengio, a prominent machine learning (ML) researcher who is also considered one of the godfathers of deep learning (DL), was recently quoted as saying:

Deep learning, as it is now, has made huge progress in perception, but it hasn’t delivered yet on systems that can discover high-level representations — the kind of concepts we use in language. Humans are able to use those high-level concepts to generalize in powerful ways. That’s something that even babies can do, but machine learning is very bad at. (here)

Bengio is correct in his diagnosis. Of the famous trio (Geoff Hinton, Yoshua Bengio, and Yann LeCun), Bengio has actually been the more open to discussing the limitations of DL (as opposed to, for example, Hinton's "very soon deep learning will be able to do anything"). But Bengio still insists that the DL paradigm can eventually perform high-level reasoning without resorting to symbolic and logical reasoning.

The problem DL extremists have in admitting symbolic and logical reasoning (S&LR) is that doing so would make neural networks just one of the tools to be used for low-level tasks such as perception and pattern recognition, while S&LR would take over the modeling of human-like high-level reasoning, the kind we use in complex problem solving, language understanding, and so on. So anyone who sees the mind as just a neural network will never admit the need for S&LR. But if they do not accept this fact now, they eventually will, provided they acknowledge the need for high-level representations of abstract concepts that allow us to "generalize in powerful ways", which Bengio clearly does acknowledge.

But There’s No Learning of Basic Facts without Logical Generalizations

In a previous article I discussed the possibility of learning top-down (as opposed to bottom-up, from data) by instantiating innate (metaphysical) templates, since otherwise it would be hard to explain how children come to know basic naïve commonsense physics so early on, without having had the time to learn these templates from data. As one example I used the LocatedIn template. The logic of this template is best described by the scenario below (repeated from the previous article).

Because his baseball glove is in his briefcase, Tom knows that if he puts the briefcase in his mother's SUV, then his baseball glove is also in his mother's SUV. He also knows that if they then drive to Woodstock, NY, his baseball glove will be in Woodstock, NY, and so on.

As we argued before, it is very troubling to suppose that a child comes to learn the logic of the above template bottom-up, from observations/data. There are many technical reasons why this is a troubling supposition, but we point here to two very critical issues with the bottom-up/data-driven learning of these (commonsense) naïve physics templates: (i) the thesis runs into a circularity problem, since in learning the logic of the LocatedIn template one would need to have already learned the logic of (for example) the ContainedIn template, which might in turn recursively assume already knowing the LocatedIn template; and (ii) if these templates were learned bottom-up, and thus individually, then different individuals could in principle learn them differently; but since the logic of these templates is not something we can learn differently, they cannot be the result of individual observations (or experiences).

But leaving aside for now all the issues/problems with an "everything is learned bottom-up from data" viewpoint, I want to concentrate here on the logic of this simple template. Clearly, a child learns this template not at the data/instance level (baseball glove/briefcase; briefcase/SUV; and then SUV/NY state, …) but at a much higher level, namely physicalObject-containedIn-physicalObject. Nothing else can explain how quickly a child becomes aware of the logic of this template (if a child had to learn this from observations of instances, they would spend a lifetime learning just this one basic commonsense physical fact).

How, then, is the logic of such a template learned? One suggestion is that it is learned top-down and not bottom-up: there are physical templates that describe the world we live in, and all that a child needs in order to "learn" these commonsense facts is to instantiate some innate templates a few times. The logic of the above specific template can be described as follows:
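The precise notation is not important; a rough sketch of this logic (using the PhysicalObject, Location, ContainedIn and LocatedIn symbols from above) might read:

(∀x : PhysicalObject)(∀y : PhysicalObject)(∀l : Location)
    [ContainedIn(x, y) ∧ LocatedIn(y, l)] ⇒ LocatedIn(x, l)

That is, wherever the containing object is located, the contained object is located there as well.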

As another simple example, a child also quickly learns that painting some chair, say chair1, red results in the fact Color(chair1, red). But once this template is instantiated, a child does not have to see door2 being painted white to know that this would result in Color(door2, white), or see car100 being painted yellow to know that this would result in Color(car100, yellow). The child learns a quantified generalization over higher-level symbols that represent concepts:
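Roughly, and again only as a sketch (where Painted is just an illustrative name for the painting event), the generalization looks like:

(∀x : PhysicalObj)(∀c : Color)
    Painted(x, c) ⇒ Color(x, c)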

It is just not at all plausible that a child learns the above template bottom-up, from data, by seeing a large number of different objects painted with different colors. In the absence of another plausible explanation of how a child comes to "learn" these commonsense physical facts so quickly, we believe that the logic of such universal templates is innate, and that a child learns these facts quickly, in a top-down fashion, by a few instantiations of the template. If that story is accepted, then the only way this could happen is by having quantification over symbols, symbols that in turn represent whole classes and not specific instances (e.g., Location, PhysicalObj, Color, etc.). In the absence of quantification and symbolic logic there seems to be no plausible explanation of how this type of generalization and learning can happen, and certainly not in a bottom-up/data-driven approach.

And There’s No Learning of Language without Logical Generalizations

Besides commonsense metaphysical facts, quantification over symbols that represent higher-level concepts is the only plausible explanation of how a child comes to learn language. Consider what happens when a child hears (and understands) sentences such as "John loves Mary" or "Mary loves playing the guitar".

The child will then know that any Human can be the agent of love, and the object of their love could be any Entity. Thus, a child will come to understand or generate any sentence that has the following structure:
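Roughly, and only as a sketch, the template might be rendered as:

(∀x : Human)(∀y : Entity)
    loves(x, y)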

This is a template for a potentially infinite number of sentences, since the agent of love could be any Human and the object of love could be any Entity (a human, such as Mary or the boy next door, or an Activity, like playing the guitar!). Thus, what a child learns is a rule that quantifies over symbols of a certain type/class. The following are some instantiations of the above template:
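For instance, reusing the examples above, some illustrative instantiations would be:

loves(John, Mary)
loves(John, the boy next door)
loves(Mary, playing the guitar)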

Again, the only plausible explanation of how a child so quickly learns these templates is that the child "masters" a high-level template that can only be properly instantiated if it is defined by quantification over symbols that represent high-level concepts (or, loosely speaking, symbols that are of a specific type).

Bengio and the Quest for an AI that can Generalize with High-Level Representations

Bengio is correct that "Humans are able to use those high-level concepts to generalize in powerful ways", and he is also correct that ML/DL "hasn't delivered yet on systems that can discover high-level representations — the kind of concepts we use in language." But he is wrong that such generalizations can (ever) be obtained if we stick to the bottom-up/data-driven paradigm to the exclusion of any other. He is also wrong that we can perform the kind of generalizations and conceptualizations that children master early on without admitting logical quantification over symbols, symbols that range over (represent) high-level concepts. Note also that some of these symbols could in turn have complex syntactic structure. Consider the following:

(1) “the tall boy that usually comes here with a black AC/DC cap”
(2) “John”

While (1) is a complex noun phrase and (2) is a simple proper name, (1) and (2) are both, semantically speaking, objects of the same type, namely Human!
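In other words, a sketch of the type assignment would be:

type("John") = Human
type("the tall boy that usually comes here with a black AC/DC cap") = Human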

Data is important. And finding important correlations in the data is also important and useful, with plenty of applications. But cognition, and specifically human-level cognition, is a lot more than observing some pattern in the data. Notwithstanding all the misguided hype and all the media frenzy, no plausible theory has so far been able to demonstrate that the kind of high-level reasoning humans are capable of can escape symbolic reasoning.

Incidentally, I have not said anything in this article that was not already observed, argued for, and proven since at least the early 1980s. Reading Jerry Fodor would be a good start; perhaps it is time to re-read AI and Cognitive Science with an open mind?

A great reference on how to formalize commonsense metaphysics is the work of Andrew Gordon and Jerry Hobbs.
