The 3 Traits of AI where Math Hits its Limits

Credit: Inception (2010)

There are 3 essential ingredients needed to understand intelligence that, unfortunately, present-day mathematics has trouble tackling. Mathematics is a set of tools that enhance our reasoning processes. It is a human language that we employ to derive an understanding of reality. However, this language is not all-powerful and does have limitations. We explore some of these limitations here with respect to areas important to AI.

Although mathematics tends to get developed way ahead of its time, there are many times when the application of a different kind of mathematics to a new domain leads to breakthroughs. Richard Feynman, for example, employed century-old path-integral mathematics to gain new insight in developing Quantum Electrodynamics. There are, however, plenty of limitations in mathematics, and this article addresses those limitations with respect to our ability to comprehend essential ingredients of cognition.

The “Quasi-empiricism” of math is not a new idea. Mathematics is a human language that we employ to describe our reality. Quoted from the Wikipedia article [WIKI-1]:

Eugene Wigner (1960) noted that this culture need not be restricted to mathematics, physics, or even humans. He stated further that “The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve. We should be grateful for it and hope that it will remain valid in future research and that it will extend, for better or for worse, to our pleasure.”

The first ingredient is the notion of time.

Time is a difficult concept to grasp. I guess the easiest way to handle it is to do what Einstein did: just treat it as another dimension.

Most physics is invariant under time reversal. Meaning, you can move forward or backward in time and the physics is identical. However, in the macro-world we don’t see it that way; time exists because entropy exists. The arrow of time follows that of increasing entropy.

In fact, there’s really no notion of memory without having to consider the existence of time.

Most mathematics doesn’t have a concept of memory. Memory is the equivalent of having state, and almost all mathematics involves functional constructs that are stateless. Functional programming follows a single-assignment rule: once a variable is set, it remains set to that value and never changes. It is this constraint that makes functional programs easy to parallelize. It is a convenient constraint that allows our mathematics to be analyzable.
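To make the contrast concrete, here is a minimal Python sketch. The names scale and RunningMean are made up for this illustration and are not taken from any library. The stateless function gives the same answer for the same input every time, while the object with memory does not.

```python
def scale(x, factor=2.0):
    """Stateless: the result depends only on the arguments."""
    return x * factor


class RunningMean:
    """Stateful: the result depends on every value seen so far."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, x):
        self.count += 1
        self.total += x
        return self.total / self.count


# Same input twice, same output twice.
stateless_outputs = [scale(3.0), scale(3.0)]

# The first and third calls both receive 3.0, yet return different values,
# because the object remembers what came in between.
tracker = RunningMean()
stateful_outputs = [tracker.update(3.0), tracker.update(9.0), tracker.update(3.0)]

print(stateless_outputs)
print(stateful_outputs)
```

To reason about the second case you have to know the whole history of calls, which is exactly the notion of time sneaking back in.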

We cannot, however, avoid time, because that’s where the dynamics come from. The only context in which mathematics is helpful for dynamics is one where no memory exists. Introduce memory or introduce state, then all bets are off! The best that mathematics can do is to quantify the boundaries of computation and not predict its final behavior (see: [WIKI-2]).

The only dynamics that are analyzable by math are equilibrium states. We can only make statements about states that are in equilibrium. What happens in between, that is, the computation, can at best only be simulated. Equilibrium is the state we get when we assume that time is at infinity. An unrealistic assumption, but an assumption that is brought about by convenience.
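A small sketch of the difference between solving for an equilibrium and having to simulate. The logistic map is my own choice of toy system here (it is not discussed above), and the function names are hypothetical. For one parameter value the resting state can be written down by hand; for another, the only way to find out where the system goes is to step through time.

```python
def logistic_step(x, r):
    """One step of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def simulate(x0, r, steps):
    """Iterate the map and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_step(xs[-1], r))
    return xs


# r = 2.5: the non-zero fixed point x* = 1 - 1/r can be solved by hand,
# and it is stable, so the long-run state is known without simulating.
r_stable = 2.5
fixed_point = 1.0 - 1.0 / r_stable
settled = simulate(0.2, r_stable, 200)[-1]

# r = 3.9: the same formula still gives a fixed point, but the trajectory
# does not settle on it. We only learn where it goes by running it.
r_chaotic = 3.9
wandering = simulate(0.2, r_chaotic, 200)[-20:]
spread = max(wandering) - min(wandering)

print("analytic fixed point:", fixed_point)
print("simulated, r = 2.5:  ", settled)
print("spread of last 20 values, r = 3.9:", spread)
```

In the first case the simulation merely confirms what the algebra already said. In the second case the algebra has nothing more to say and the simulation is all we have.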

There is also the notion of asynchrony, which is such a beast in complexity. That is, when different parallel processes are not in lockstep synchrony. Almost all our digital circuitry requires lockstep synchrony in the form of a common clock that drives behavior. The biological brain does not have a common clock; it works in a regime of asynchrony.
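Here is a toy illustration of why asynchrony is such a beast, assuming a made-up system of just two nodes where each node copies its neighbour. The function names are hypothetical. With a common clock the pair oscillates forever; without one, the outcome depends on who happens to update first.

```python
def synchronous_step(state):
    """Every node reads the OLD state, then all nodes switch at once
    (what a shared clock gives you)."""
    a, b = state
    return (b, a)  # each node copies its neighbour


def asynchronous_step(state, order):
    """Nodes update one at a time, each seeing the latest values."""
    nodes = list(state)
    for i in order:
        nodes[i] = nodes[1 - i]  # copy the other node's current value
    return tuple(nodes)


start = (0, 1)

sync_history = [start]
for _ in range(4):
    sync_history.append(synchronous_step(sync_history[-1]))

async_history = [start]
for _ in range(4):
    async_history.append(asynchronous_step(async_history[-1], order=(0, 1)))

print("synchronous: ", sync_history)
print("async, node 0 first:", async_history)
print("async, node 1 first:", asynchronous_step(start, order=(1, 0)))
```

Same rule, same starting point, three different behaviours. With two nodes you can enumerate the orderings by hand; with billions of components you cannot.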

The second ingredient is the notion of collective emergent behavior.

Robert Sapolsky has a short lecture on YouTube [SAP] (“Thinking about emergence and chaos”) that makes the point about bottom-up behavior (special thanks to Felix Hovsepian for pointing out this video). He says that “most of the stuff that he and his peers do is reductive stuff that is very limited.”

Intelligence comes from the emergent behavior that arises from the collective behavior of millions or billions of interacting components. This is the very essence of the concept of Connectionist AI. The components themselves do not have to be constructed in a complex manner; they can be very simple and, in fact, all uniform. Artificial Neural Networks and Deep Learning spring from this very idea of deriving intelligence from simple components called ‘neurons’. It is important to remind oneself that the neurons in an ANN are a cartoonish version of biological neurons. However, it is not the precise construction of the neuron that is important, but rather the collective behavior.
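A quick way to see simple, uniform components producing complicated collective behavior is an elementary cellular automaton. This is a sketch of my own choosing (Rule 110 is not mentioned above, and the function names are hypothetical): every cell runs the same tiny lookup on itself and its two neighbours, and nothing else.

```python
def step(cells, rule=110):
    """Apply one elementary cellular automaton step with wrap-around edges.

    Every cell runs the same lookup: its next value is the bit of `rule`
    selected by the three-bit number (left, self, right).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, me, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (me << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


def run(width=31, steps=15, rule=110):
    """Start from a single live cell at the right edge and record each row."""
    cells = [0] * width
    cells[-1] = 1
    rows = [cells]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows


for row in run():
    print("".join("#" if c else "." for c in row))
```

The rule fits in a single byte, yet the printed pattern is not something you would guess from reading the rule. The interesting part lives in the interactions, not in the individual cell.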

That is why the argument that ANN and DL should be rejected because they are not biologically plausible is a very bad one. It is entirely conceivable that intelligence can be arrived at with very different kinds of ‘neurons’. That’s because there is some fundamental capability that a neuron performs (i.e. information dynamics, meaning computation, memory, and signaling), and that is all that is needed. The connectivity, however, is where intelligence emerges.

The third ingredient is the notion of meta-level reasoning.

This is the most difficult idea to grasp and it may, in fact, be the reason why ‘consciousness’ exists. We can understand the idea of building up ideas by the composition of more primitive ideas. We can understand this because that is how language is constructed. That is, from letters to syllables to words to sentences to paragraphs, etc.

We also know of meta-level reasoning. It’s one of those ideas that’s hard to explain to novice programmers, but it exists in many programming languages. That is, you have programs that operate on the building blocks of the language itself. It leads to very expressive and short source code. Experienced programmers have no difficulty working at the meta-level. However, these kinds of systems are extremely difficult to debug.
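For readers who have not met this before, here is a minimal Python sketch of a program that operates on the building blocks of the language. The helper make_record is hypothetical; the only real machinery used is the built-in three-argument form of type(), which creates a class at runtime.

```python
def make_record(name, fields):
    """Build a new class at runtime from a list of field names."""

    def __init__(self, *values):
        if len(values) != len(fields):
            raise TypeError(f"{name} expects {len(fields)} values")
        for field, value in zip(fields, values):
            setattr(self, field, value)

    def __repr__(self):
        body = ", ".join(f"{f}={getattr(self, f)!r}" for f in fields)
        return f"{name}({body})"

    # type(name, bases, namespace) creates a class programmatically,
    # instead of writing out a `class` statement by hand.
    return type(name, (object,), {"__init__": __init__, "__repr__": __repr__})


Point = make_record("Point", ["x", "y"])
p = Point(1, 2)
print(p)
```

One short function now stands in for any number of hand-written classes. It also shows the debugging problem: when something goes wrong inside Point, there is no Point source code to look at.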

However, it doesn’t stop with just one level of meta-reasoning. You could have meta-meta-level constructs ad infinitum. I’ve encountered this idea in the wild in the modeling language UML. There’s a concept of meta-metamodels. Here’s the definition of a metamodel:

A metamodel or surrogate model is a model of a model, and metamodeling is the process of generating such metamodels.

Which, it just occurs to me, is the most universal definition of “Generalization”.
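Python happens to offer a small, concrete tower of this kind, which I include only as an illustration and not as a statement about UML. An object is described by its class, and the class is itself an object described by another class. The sketch below climbs the levels until it reaches one that describes itself.

```python
value = 42
levels = [value]

# Climb from an object to its class, then to the class of that class,
# stopping once a level is its own description.
while type(levels[-1]) is not levels[-1]:
    levels.append(type(levels[-1]))

print(levels)
```

In this particular language the tower is closed off by making the top level describe itself, rather than by continuing ad infinitum.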

This lecture by James Crutchfield on “The Complexity of Simplicity” gives a very good sense of the enormous gap that we have between our math and its ability to analyze complex systems:

Post Commentary

Some readers have come away from this article with the impression that I’m implying that mathematics is not needed. On the contrary, it is absolutely necessary. However, I am also banging the table for those who can’t see that present-day mathematics has its limitations. There are many who continue to stick to 18th-century Bayesian logic and the corresponding mathematics and have an unsubstantiated belief that it is actually going to work in this new domain.

There are plenty of times when I see researchers attempt to cast DL systems in terms of ‘equivalent’ Bayesian networks in the hope that placing a round peg into a square hole will actually work. Well, it’ll work if the round peg’s diameter is smaller than the width of the square hole. But it is obvious that it wouldn’t be a great fit. There is absolutely no evidence that reductionist logic is going to work in a domain of collective emergent behavior. If you approached a room of statistical physicists about using Bayesian inference, you would likely be thrown out of the room in ridicule. Let’s all get real, folks!

Further Reading