The Limits of Human Understanding
Let’s explore what “understanding” means, a word that gets thrown around liberally in cognitive science. Consider two mechanical systems, computer-assisted proof systems and AlphaZero, which operate in a regime that is beyond our intuitive understanding of understanding.
The Four Color theorem in mathematics has a computer-assisted proof that humans find to lack any insight. Mathematicians call these kinds of proofs “non-surveyable”: they are infeasible for a human to verify by hand (see: Non-surveyable proof).
The intuition as to why these non-surveyable proofs can’t be understood is the sheer number of steps involved: too many for a human to follow and thus understand. This is despite humans being able to program those very steps!
Then there’s AlphaGo’s move 37. It is a move that couldn’t be explained with our knowledge of Go at the time; only after seeing how the game evolved did we accept that it was a good move. I called this “incomprehensibility in the small” (see: Why AlphaGo Zero is a Quantum Leap Forward in Deep Learning).
As with Fermat’s Last Theorem, an intuitive system has a hunch about a truth, but we can’t know that it is true until a proof arrives later. We can argue that AlphaGo conjures up its move 37 out of billions of mechanical perturbations that lead it to its conclusion. Each perturbation is unexplained, and thus the final decision is also unexplainable. Proof demands that every step has an explanation!
So we have two kinds of systems that are beyond human understanding: the needle-in-a-haystack kind, where we can’t be sure whether there’s a single unexplainable step somewhere in a large proof, and the kind where almost every step is unexplainable yet the whole has the emergent effect of being true.
There exists an entire spectrum of partially explainable proofs between the two kinds of incomprehensibility. Somewhere in this spectrum, we rubber-stamp the idea that an explanation is “understandable.” Hence, what is the intuitive understanding of understandability?
Understandability is a consequence of having a language to explain a complex system. But here’s the thing that people may not be aware of. There are alternative ways of explanation that go beyond the Newtonian kind of explanation.
The Newtonian explanation works in the regime of linear systems, where we can account for all of the interactions of a system and sum them up cleanly into a final prediction. But not all systems exist in such a decoupled context.
There are several new kinds of explanatory frameworks that do have appeal. Constructor theory, semiotics, and category theory all seem to have a common feeling of being at the correct level of abstraction. That is, if we work from an abstraction of incomprehensible processes, we can compose them together in a manner that is comprehensible!
If we accept process metaphysics instead of the predominant substance metaphysics, then we must accept that numbers are actually processes and not real things (see: Classical and intuitionistic mathematical languages shape our understanding of time in physics). In calculus, the symbol rewrite rules are discovered as a consequence of proofs of convergence of an infinite series of calculations. However, not all infinite series are known to converge. Furthermore, numbers like π become incrementally more precise the longer the computation runs.
But no universal algorithm can find a proof that an arbitrary infinite series will converge (a limit closely related to the halting problem). The lack of a proof doesn’t imply that a series of computations does not converge. There are plenty of systems in reality that converge to recurrent patterns.
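To make the “numbers are processes” idea concrete, here is a minimal sketch. The choice of the Leibniz series for π and the harmonic series as a contrast is my own assumption for illustration; neither appears in the argument above. Both series add ever-smaller terms, yet one settles toward a value and the other grows without bound, and running either for a finite number of steps doesn’t by itself tell you which is which.

```python
from math import pi


def leibniz_partial_sum(n_terms: int) -> float:
    """Approximate pi with the first n_terms of 4 * sum((-1)^k / (2k + 1))."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k / (2 * k + 1)
    return 4.0 * total


def harmonic_partial_sum(n_terms: int) -> float:
    """Sum 1/1 + 1/2 + ... + 1/n_terms; the terms shrink but the sum diverges."""
    total = 0.0
    for k in range(1, n_terms + 1):
        total += 1.0 / k
    return total


# More steps of the process give a more precise pi.
step_counts = (10, 100, 1000, 10000)
errors = [abs(leibniz_partial_sum(n) - pi) for n in step_counts]

for n, err in zip(step_counts, errors):
    print(f"{n:>6} terms: error in pi = {err:.6f}, harmonic sum = {harmonic_partial_sum(n):.4f}")
```

The error in π keeps shrinking as the computation runs longer, while the harmonic sum keeps creeping upward. Knowing that the first converges and the second does not takes a proof, not more steps.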
Wolfram calls this behavior “causal invariance.” Complex systems converge to emergent behavior, and we don’t have a good explanation as to why they do. There is an explanation, but it’s incomprehensible to us. But all you need to do is to follow the steps!
Understanding is a consequence of the language we use to explain our observations. It’s often argued that present deep learning systems lack understanding. This is true if we frame understanding as the ability to explain using human experience and language.
That said, there exists a kind of understanding that is a challenge for the average human. It involves seeing a context through multiple reference frames. Human cognition is often of the tunnel-vision kind that only appreciates what confirms its biases.
The “realness” of human perceptions is proportional to the number of modalities that are engaged. This is related to, but not the same as, the “memorability” of the experience. What is remembered requires a relevant difference.
We have no trouble distinguishing reality from our dreams. We also forget a majority of our dreams despite many being extremely novel. It is not novelty alone that aids recall.
If we frame modalities as different reference frames, we begin to see how understanding and memory are related. Understanding is being able to see the same thing in multiple reference frames. To see the sameness in different contexts. To remember is to understand.
Unfortunately, not all kinds of reference frames are used by human brains for retrieval. Analogies trigger how we retrieve our memories. We are not as well tuned to have complex abstractions trigger our memories. Our brain architecture isn’t designed for this!
Iconic signs trigger our memories. Abstractions are symbolic signs. They are two levels of indirection deep, and hence memory retrieval requires more work. Natural language reduces this effort by embedding metaphors in language.
Peirce’s semiotics categorizes signs in a triad of icons, indexes, and symbols. But recall that a symbol is a peculiar kind of index and an index is a peculiar kind of icon. Iconicity and its many forms (e.g., analogy, metaphor) are at the core of cognition.
Thus it becomes clear what modalities and reference frames imply with respect to cognition. There are ways of perceiving the world using similarity (i.e., in iconic fashion). We understand the world because we recognize similar experiences.
Most people can think dyadically because it’s damn easy to think simultaneously of two concepts no matter how divergent they are.
Triadic thinking is more difficult. It requires thinking of 3 concepts simultaneously and the 3 dyadic relationships.
However, quadratic thinking is next to impossible for most humans. This involves 4 concepts simultaneously, 4 triadic relationships and 6 dyadic relationships: 14 items in total. At best, humans are capable of attending to 3–5 items at a time.
Interestingly, with a dyadic form of 2 concepts, you have only 1 relationship. A triadic form has the same number of concepts and relationships. The quadratic form explodes into an intractable number of relationships.
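The counts above are just binomial coefficients, and a small sketch makes the growth visible. The function name `relation_counts` and the convention of adding concepts, dyadic, and triadic relationships into one total are my own assumptions, chosen to match the tally of 14 for the four-concept case.

```python
from math import comb


def relation_counts(n_concepts: int) -> dict:
    """Count the pairs and triples that can be formed among n_concepts."""
    dyadic = comb(n_concepts, 2)   # unordered pairs of concepts
    triadic = comb(n_concepts, 3)  # unordered triples of concepts
    return {
        "concepts": n_concepts,
        "dyadic": dyadic,
        "triadic": triadic,
        "total": n_concepts + dyadic + triadic,
    }


for n in (2, 3, 4, 5):
    print(relation_counts(n))
```

With 4 concepts this gives 6 dyadic and 4 triadic relationships, 14 items in all, which is already well past the 3 to 5 items that humans can attend to at once. Note that for 3 concepts the sketch also counts the single triadic relationship itself, on top of the 3 dyadic ones.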
Therefore any human cognition is a subset of a triadic form of thinking. Sequences are an example of a triadic relationship between the current, before, and after concepts. From sequences, you can construct up to the complexity of languages.
The kind of triadic thinking that Peirce was fond of involved an ordered relationship between 3 concepts: a firstness, secondness and thirdness, where each subsequent one depends on the previous one. It is a constructive triadic logic that parallels evolution.
In conventional conceptual modeling, concepts either have attributes or they have relationships with other concepts. It’s up to the modeler to decide whether to cast something as an attribute or as a relationship. In Domain-Driven Design, the rule is that an attribute is a value without identity.
In Assemblage Theory, the parts of an assembly may have exteriority or interiority. That is, the parts can either stand on their own or are dependent on the whole. Parts that have exteriority are parts that play many roles in different contexts.
The complexity of human cognition is that our reasoning is rich in context. Hence Wittgenstein described human communication in terms of language-games. Wittgenstein argues that words do not have meaning outside of the language-game. Parts of language have exteriority.
The continuity of conversations is possible due to our human ability to connect thoughts together via analogy. Analogy-making compares two relationships, hence four concepts. Two concepts are similar if they are iconic, but two relationships are similar if they are analogous.
I conjecture that all cognition is a kind of sequence-based cognition. The difference between a fluent system (sequential language) and an empathy system is the difference between the triadic constraints imposed on each thought. The latter affords greater parallel processing.
The limitation of deep learning systems is that they stitch together only dyadic forms. That is, everything is a similarity operation (i.e., iconic) between two concepts. Large language models are fundamentally more powerful due to their indexical capabilities.
But it’s uncertain to me at this time if there are any architectural limitations with the combination of transformer and diffusion models. I don’t know if the building blocks of AGI are already here, and it’s just a matter of scale.
I say this because human cognition is confined to triadic forms of reasoning, and it’s unclear to me what is missing in current architectures. Said differently, if automation can do a kind of triadic thinking, then it’s a situation that requires only incremental improvement.
I conjecture that any yet-to-be-invented deep learning will just be a variant of a triadic form of thinking. Quadratic thinking is too complex and unnecessary when triadic thinking is already compositional in nature.
Not many are aware that in physics our analytical equations generally have no closed-form solution beyond 2 bodies or 2 dimensions (for fluids). It’s only through numerical simulation that we have access to approximate solutions. The same analogy can be made with cognition.
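As a rough illustration of what “access through numerical simulation” looks like, here is a minimal sketch of three gravitating bodies stepped forward with a leapfrog (velocity Verlet) integrator. Everything specific in it is my assumption: the units with G = 1, the equal masses, the time step, and the starting positions and velocities, which are illustrative values resembling the well-known figure-eight configuration. There is no formula to evaluate here; you only get an answer by following the steps.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)


def accelerations(pos, mass):
    """Pairwise Newtonian gravity in 2D; returns one [ax, ay] per body."""
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * mass[j] * dx / r3
            acc[i][1] += G * mass[j] * dy / r3
    return acc


def total_energy(pos, vel, mass):
    """Kinetic plus potential energy; used only to sanity-check the integrator."""
    kinetic = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            potential -= G * mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return kinetic + potential


def simulate(pos, vel, mass, dt, steps):
    """Advance copies of the state with leapfrog (kick, drift, kick) steps."""
    pos = [p[:] for p in pos]
    vel = [v[:] for v in vel]
    acc = accelerations(pos, mass)
    for _ in range(steps):
        for i in range(len(pos)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
                pos[i][k] += dt * vel[i][k]        # drift
        acc = accelerations(pos, mass)
        for i in range(len(pos)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return pos, vel


# Illustrative starting state (assumed values, not taken from the text).
mass = [1.0, 1.0, 1.0]
pos0 = [[0.97000436, -0.24308753], [-0.97000436, 0.24308753], [0.0, 0.0]]
vel0 = [[0.46620368, 0.43236573], [0.46620368, 0.43236573], [-0.93240737, -0.86473146]]

pos1, vel1 = simulate(pos0, vel0, mass, dt=0.001, steps=1000)
print("energy before:", total_energy(pos0, vel0, mass))
print("energy after: ", total_energy(pos1, vel1, mass))
```

The energy check is the only handle we have on whether the approximation is trustworthy: the result is a trajectory we can follow step by step, not an equation we can read.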
Human cognition is limited to processing triadic forms. We, however, have general intelligence because triadic forms are composable. The difference in cognitive processing is reflected by the consumed triadic form.
But it should be obvious that Deep Learning can’t achieve autonomy without the triadic form involving a selfhood model.