12 Blind Spots in AI Research
Humans by nature have many cognitive biases, and these biases can be detrimental to real scientific progress. Research tends to favor approaches in which many experts have already invested countless years of study, and as a consequence we ignore many intrinsic characteristics of the very system under study. Researchers can thus unfortunately consume a lifetime pursuing a wrong and pointless path. History is littered with research that in hindsight was discovered to be incorrect and therefore worthless.
Sabine Hossenfelder has written about this kind of cognitive bias in her own field of high energy physics. Her book “Lost in Math” explores the biases of a group of the most intellectually gifted scientists on the planet. Despite this surplus of cognitive capability, Hossenfelder argues that theoretical physicists’ relentless pursuit of beauty has led to elegant mathematics but also to wrong science; foundational physics has not seen a major breakthrough in over four decades. She argues that scientific objectivity has been compromised for the sake of aesthetics.
In this post, I will explore similar biases in our quest for Artificial Intelligence (AI), or, in its expanded form, Artificial General Intelligence (AGI). One of the more prominent biases is the employment of Bayes’ Theorem, which is commonly justified by appeal to Occam’s Razor:
“The explanation requiring the fewest assumptions is most likely to be correct.”
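For reference, Bayes’ Theorem describes how a prior belief P(H) in a hypothesis H is updated by data D:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```

The connection to Occam’s Razor is usually made through the prior: simpler hypotheses are assigned higher prior probability P(H), so, all else being equal, the simpler explanation comes out more probable.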
This is further justified by Popper’s falsifiability criterion: simpler explanations lead to easier and more feasible testability. It has led to the following maxim, commonly attributed to Einstein, that many scientists have devoted their careers to:
“Everything should be made as simple as possible, but not simpler.”
These are good rules of thumb for humans trying to comprehend the world; however, they are precisely the kind of appeal to aesthetics that Hossenfelder argues against. Breakthroughs in science will always begin with the gut feelings of scientists, but those hunches should never harden into dogma. To quote Richard Feynman:
It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.
I now bring together 12 facts that the majority of AI/AGI researchers continue to ignore.
1. Dual Process Cognition

Cognitive psychologists have long established that human cognition operates using two distinct systems. This is known as Dual Process Theory: an intuitive system and a rational system in coordination and competition with each other. The failure of Good Old Fashioned AI (GOFAI) can be traced to the belief that human cognition is based primarily on the rational system, despite all evidence to the contrary. The intuitive system is the intrinsic engine that drives all of human cognition; it pervasively employs ‘amortized inference’.
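A rough way to state ‘amortized inference’: rather than solving a fresh, expensive inference problem for every observation x, a recognition model with shared parameters φ is trained once to approximate the posterior directly:

```latex
q_\phi(z \mid x) \approx p(z \mid x)
```

The intuitive system can then answer in a single fast forward pass, amortizing the cost of inference across all past experience instead of paying it anew each time.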
2. Sleep and Offline Development

Nature requires that humans spend a considerable fraction of their time sleeping; sleep deprivation leads to cognitive failure and eventually death. So at the very least, there is empirical evidence that sleep is important for cognitive development. Unfortunately, most research ignores the importance of offline cognitive development. The human capability to develop abstractions is likely driven by processes that occur during sleep.
3. Incomplete Representations
There is a long tradition among scientists of seeking out the existence or the generation of an internal mental representation that mirrors the external world. GOFAI builds symbolic models that reflect the semantics of the external world; connectionist Deep Learning trains networks to learn an internal representation of the world, one that reflects the semantics of observations in a continuous high-dimensional space. This space permits operations, such as similarity comparison, that support reasoning.
There are two problems. The first is that internal representations merely push the problem of understanding down another level: mapping external reality to an internal representation still requires something to understand that internal representation. The second is that we attend to only a small subset of what we can observe. Cognitive blindness exists; unlike machines, which capture a photographic representation, we attend to only a visual subset and imagine the rest. James Gibson, in his ecological theory of perception, describes a different kind of representation, one based on affordances: brains perceive only what is possible in the environment, not an exact representation of the environment.
4. Haptic and Integrative Perception

How do we recognize an object just by touching it? How are we able to recognize an apple inside a brown bag by touch alone? Perception apparently isn’t based on single snapshots of reality, but rather on multiple snapshots of attention that are stitched together into a consistent whole. Humans have an unusually dexterous hand that perhaps gives rise to our advanced cognitive capabilities: we have better representations of the world because we have richer interactions with the world and are better integrators of disparate attentive information. Study of this exploratory and integrative cognitive mechanism is largely absent in today’s research.
5. Bounded Rationality
Humans have limited rationality. Most people cannot attend to more than five items at a time, and almost no one can go deeper than three levels of recursion. Yet many proposals for AGI demand unbounded rationality (e.g. AIXI, Solomonoff induction, symbolic computation, probabilistic reasoning). The point here is that simple heuristics may be all that drives human thinking; we don’t have to appeal to elegant mathematical approaches. Researchers are motivated to exploit the mathematical tools in their arsenal, which results in either overcomplicating the problem or using the wrong tools for the problem of human cognition. Kolmogorov complexity is an interesting formulation of a complexity measure; however, it is unlikely to be relevant to bounded-rational, human-complete intelligence.
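As an illustration of how far cheap heuristics can go, here is a minimal sketch of a ‘take-the-best’ style decision rule in the spirit of the fast-and-frugal heuristics literature; the city data and cue ordering below are hypothetical, chosen only to show the mechanism:

```python
def take_the_best(option_a, option_b, cues):
    """Decide between two options using cues ordered from most to least valid.

    The first cue that discriminates between the options decides; all later
    cues are ignored. Returns the winning option, or None if no cue
    discriminates (a bounded agent would then simply guess).
    """
    for cue in cues:
        a, b = cue(option_a), cue(option_b)
        if a != b:
            return option_a if a > b else option_b
    return None  # no cue discriminates: fall back to guessing


# Hypothetical example: which of two cities is larger?
cities = {
    "Berlin": {"capital": 1, "has_team": 1},
    "Essen":  {"capital": 0, "has_team": 1},
}
cues = [
    lambda c: cities[c]["capital"],   # most valid cue, checked first
    lambda c: cities[c]["has_team"],  # only consulted if the first cue ties
]
print(take_the_best("Berlin", "Essen", cues))  # Berlin (the capital cue decides)
```

No probabilities, no search, no optimization: a single discriminating cue settles the choice, which is the kind of boundedly rational shortcut the paragraph above has in mind.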
6. Descriptive vs Generative Models
Daniel Dennett describes evolution and Turing machines as “competence without comprehension”. Many complex processes can achieve considerable capability without an understanding of their own actions. This of course shouldn’t be a surprise. However, it befuddles many who cannot comprehend how complex design can arise from non-intelligence. I suspect this is due to the human cognitive bias to seek out order and thus an underlying design.
Mathematical and logical models of reality are all descriptive models. What this means is that scientists recognize order in the world in the form of invariant laws and these are used as compressed descriptions of reality. However, descriptive models are not intrinsically part of reality, they are the observed order that emerges from complex generative behavior.
Humans seek to recognize order even where there is none. We can look at a cloud and see the Easter Bunny; the dynamics of the cloud have no intentionality to generate an Easter Bunny, it is our subjective observation of the world that sees the order. What this implies is that descriptive models at best characterize the behavior of reality; they are not the same as the generative model. Simulations driven by descriptive models (i.e. probabilistic models) therefore remain simulations, not generative processes. Furthermore, the halting problem suggests that discovering the hidden generative model through observational descriptive models may be unrealistic.
7. Emergent Far from Equilibrium Processes
Where does generative (not merely descriptive) order originate? The second law of thermodynamics implies a tendency toward disorder, disorder here being a measure (i.e. entropy) of our inability to assign descriptive order to our observations. Ilya Prigogine showed that order arises from far-from-equilibrium processes. The problem with current machine learning methodology is that the mathematics always assumes some kind of equilibrium process.
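One standard way to make this reading of entropy precise is the Gibbs form, where p_i is the probability of finding the system in microstate i:

```latex
S = -k_B \sum_i p_i \ln p_i
```

S is maximized when the p_i are uniform, that is, precisely when our description says the least about which state the system is actually in.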
However, we need to go beyond mere self-organizing functionality. Cognitive capability bootstraps itself through a developmental program that layers one emergent capability on top of another. The mathematical problem with emergence is that it can’t be pre-computed: each new emergent capability expands the space of possibilities, so something becomes knowable that previously was not. An intuitive depiction of this can be seen in human technological innovation: who could have imagined Google prior to the invention of the World Wide Web?
8. A Model of Self

One critical piece of information that is often ignored in cognitive models is information about the self. All biological organisms react in relation to a primitive notion of self: their behavior is driven by the purpose of surviving and replicating. All of biology is driven by this notion of preserving the boundaries of a self.
So when we begin to explore higher cognitive beings like ourselves, we must incorporate in our models a notion of self (or many selves). A consequence of including a model of self is that it requires modeling embodiment within an ecosystem: a cognitive being must incorporate a first-person model that is aware of, and can perceive, its interactions with its ecosystem.
9. Cellular Cognition

Complex cognitive behavior in service of maintaining a self can be found in the simplest of biological cells. Michael Levin has shown that the process of morphogenesis requires complex cognition (i.e. bioelectrical computation) in the absence of a central nervous system. This idea throws a major wrench into our descriptive models of the human brain: if every neuron has the same complex behavior as other eukaryotic cells, then the coordinated dynamic models found in Deep Learning may be inadequate to capture basic capabilities such as adaptation and self-repair.
10. Non-Stationary Cognitive Models
The cognitive models that we have today are static and tuned to perform single tasks. To achieve the kind of adaptability found in biology, however, our cognitive models must be non-stationary, continual, just-in-time and conversational. Adapting to ecosystems requires continuous context switching in which the most beneficial heuristic is selected.
How is structure preserved in a complex and dynamic ecosystem? Biological organisms are self-regulating, and ecosystems with many intentional actors are also self-regulating: there is a higher-level mechanism that keeps things from getting too far off kilter. Is this the same mechanism that allows us to keep track of complex conversations?
11. Collective not individual
Models of self cannot be divorced from the ecologies in which they developed (see: embeddedness). In fact, capabilities may only develop in ecologies very different from the one an organism currently finds itself in. Life, for example, requires amino acids, DNA and lipids. How did all three evolve to become essential to each other? It is likely that each evolved separately in different ecosystems and only later entered into a synergistic relationship in the current biosphere. Life in general cannot thrive independent of the ecosystem it belongs to. This should also imply that advanced cognitive capabilities like language could not have evolved separately from the cultural environment humans find themselves in: humans have language not because of an innate capability (see: Chomsky’s Merge, or Marcus) but because we have a unique cultural system that teaches us language.
12. Pre-Natal and Neoteny Development
Does one ever wonder why mammals (other than the more primitive marsupials) have their offspring gestate in the womb? What about the human baby, which has functioning cognitive capabilities for several weeks before it leaves the womb and is born? We assume that human cognitive development begins after birth, when there is ample evidence that it begins before birth, and that much of the mother’s interaction with and reaction to the world is reflected onto her child in the womb. What does a child learn while in the womb?
The other unique fact about humans is that they remain childlike for a much longer time. Humans between 2.5 and 3 years old have cognitive capabilities similar to those of the great apes. The retention of juvenile features in animals is called neoteny: dogs have retained larger eyes, floppy ears and shorter snouts compared to their wolf cousins, and the axolotl remains an infant its entire life, with the remarkable capability of regenerating most parts of its body.
Recent experiments have revealed that of 299 age-linked genes studied in primates, 40 are expressed later in life in humans. In fact, it is known that faster-than-normal maturation in humans can lead to reduced cognitive capability. There are indeed real cognitive benefits to remaining young much longer: a young person has higher brain plasticity than an adult, and being young longer affords humans enough runway to develop socially attained cognitive skills (i.e. imitation, language and empathy). Humans are perhaps more intelligent than other species because we sustain a learning strategy of curiosity and exploration for a much longer time.
This final fact should tell you that an ossified learning strategy, one that clings to its ‘guns or religion or antipathy’, is in danger of failing to innovate. Those who seek out novel paths of exploration are the most likely to discover something new.
As you might have noticed, each of these blind spots is related to the others. They aren’t exactly orthogonal; rather, in combination they reveal a new, holistic perspective for more advanced AGI research. It is interesting that this list has a distinct flavor that differs from the overly engineering-logic approach of most AGI research, which is perhaps due to the cognitive biases inculcated by most science, engineering and math curricula. You cannot build AGI from the top down. Rather, it has to be built from the bottom up, and that means it has to be grown; an organic logic is needed. Although the idea of growing intelligence seems obvious, there are surprisingly many researchers who argue against this perspective. There is a prevalent bias that AI work is engineering, when in fact it is more closely related to psychology and biology.
Dave Ackley, in Beyond Efficiency, writes about why our computational systems are so fragile, attributing it to a ‘correctness and efficiency only’ mindset. The drumbeat of constant computer security failures tells us that something drastic must be done to incorporate the kind of robustness we find in biology. Ackley blames the problem on the lack of emphasis on “Systems Thinking”, which takes a holistic approach (a focus on the whole) with an emphasis on structure; it is related to cybernetics, the study of control, feedback, regulation and interaction. Advances in AGI will require a better understanding of human cognition as well as an appreciation of the complexity of biology. I will thus have to rephrase Arthur C. Clarke’s quote about technology. It is more accurate to say:
Any sufficiently advanced technology is indistinguishable from biology.