The Emergence of Inside Out Architectures in Deep Learning
There are many misunderstandings that AI researchers hold on to that lead them to dead ends. The most well known is the idea that has driven GOFAI (Good Old-Fashioned AI) since the 1950s: that intelligence is reducible to simulating logical reasoning. Deep learning systems are a form of intuition machine, and they have unequivocally demonstrated an alternative path towards logical reasoning. There are two other ideas that need to be recognized to make further advances in general intelligence, and the two are related. The first is that the brain has an “Inside Out” architecture, generating an internal mental model from which it performs predictions. The second is that intelligence is not related to consciousness.
A good introduction to these two ideas can be found in Anil Seth’s (a neuroscientist) TED Talk, which I recommend you watch:
Anil Seth has a similar talk in which he argues that we are all “beast machines”. I would use the equivalent term “intuition machines”. My own research is consistent with the theories that Seth derives from neuroscience. In fact, it is all consistent with the theories proposed by Daniel Kahneman (behavioral economist), George Lakoff (cognitive linguist) and Jonathan Haidt (social psychologist). There is sufficient corroborating evidence from these fields to validate the developments we find in Deep Learning.
In the above TED Talk, Seth explores the properties of consciousness. He argues that the brain is a prediction engine and that our perception takes the form of informed guesswork. That is, our brain makes its best guess of what is out there in the world. The brain forms prior expectations of the world and uses them to generate predictions flowing in the opposite direction, from the inside out; in this sense our perception is a “controlled hallucination”. This is a very different idea from the more common view of brains as “input-output” machines. The vast majority of machine learning work is based on interpreting an input to create an output in the form of a prediction. In an Inside Out architecture, prediction comes from internal mental models (i.e. hallucinations), and we are informed only of anomalies, detected by comparison against our own mental models. Anil Seth describes this prediction architecture in “The Real Problem.” The architecture was proposed by Helmholtz in the 19th century. Our mental models can override actual perception, an effect we see in visual and auditory illusions.
Consciousness is our mental models of ourselves (i.e. models of the self). Seth enumerates different models of the conscious self: the bodily (including interoception), perspectival, volitional, narrative, and social selves. The purpose of our predictive engine is to maintain the essential internal self models needed to stay alive. In deep learning research, the bodily self has been simulated in work on ego-motion. Progress in AGI research can be tracked by observing which kinds of self model have been empirically demonstrated in Deep Learning research.
There is a subtle but significant difference in architecture when you go from simple stimulus-response to hallucinating a response. It is at least the difference between an insect brain and a mammalian brain. All animals have an ability to react and adjust to their environment, and mammals have inherited the reptilian brain that drives their instinctive behavior. But mammals have, in addition, a more advanced brain that is able to respond at a more intelligent level. The key development in mammals is the neocortex, which is responsible for higher-order functions. The neurons in the neocortex have evolved to specifically implement an Inside Out architecture.
Blake Richards delivered a talk at ICLR 2018 that covered the advances in neuroscience in understanding neurons. In Richards’s talk, “Ensembles of Neocortical Microcircuits”, he describes the pyramidal neurons found in the neocortex. These pyramidal neurons exhibit behavior that provides additional evidence for an Inside Out architecture:
This Inside Out architecture is being recognized as a paradigm shift in neuroscience, where the conventional stimulus-response paradigm has long been the dominant conceptual framework.
…it was discovered that the likely ancestral state of behavioral organization is one of probing the environment with ongoing, variable actions first and evaluating sensory feedback later (i.e., the inverse of stimulus response).
There are other recent studies that validate this Inside Out paradigm.
Animal movements and internal state transitions generate an internal backdrop of activity that is dynamically…www.biorxiv.org
Summary: A new study sheds light on how the cerebellum is able to make predictions and learn from mistakes, especially…neurosciencenews.com
Now, given that the biological brain has an Inside Out architecture and not a stimulus-response architecture, we should devote considerable effort to exploring these kinds of architectures and less to the older (and incorrectly informed) kind.
DeepMind’s MERLIN paper by Greg Wayne et al. explores this very idea in much greater detail. Earlier I wrote about the link between sleep and Deep Learning. Our brains reinforce memories during non-REM sleep and discover novel associations during REM sleep; we alternate between optimization and exploration while we sleep. We will see this sleep mechanism combined with an Inside Out architecture in MERLIN. The MERLIN architecture can be quite complicated.
However, it is an extremely compelling architecture that trades internal hallucination against reinforcement learning. Conventionally, compression is used as a proxy for measuring generalization. However, I propose that ‘sampling efficiency’, the amount of interaction a system needs in order to learn a task, is a much better metric for measuring intelligence, and one that aligns well with this Inside Out paradigm.
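One way to make ‘sampling efficiency’ concrete is to count how many environment interactions a learner consumes before reaching a target score. This is only a minimal sketch; the `learner.update(sample)` interface and the function names are hypothetical stand-ins, not any established benchmark API:

```python
def samples_to_threshold(learner, env_step, target, max_samples=10_000):
    """Count the environment samples a learner needs to reach a target score.

    `learner.update(sample)` ingests one interaction and returns the
    learner's current score (a hypothetical interface). Fewer samples
    to reach the target means better sampling efficiency.
    """
    for n in range(1, max_samples + 1):
        score = learner.update(env_step())
        if score >= target:
            return n          # reached the target after n samples
    return max_samples        # never reached the target within the budget
```

Under this metric, an Inside Out learner that hallucinates most of its experience internally should reach the target with far fewer real environment samples than a pure stimulus-response learner.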
One benefit of an Inside Out design is that it can react quickly to its environment. Once a context is identified, the system is primed and proceeds to hallucinate the subsequent sequential behavior. Divergence of the input from what is expected is rapidly recognized, and a new context is instantiated to compensate for the unexpected inputs.
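The loop described above can be caricatured in a few lines. This is a toy sketch, not a real predictive-coding implementation: each “context” is just a zero-argument callable that hallucinates an observation, and the class and parameter names are my own hypothetical choices.

```python
import numpy as np

class InsideOutLoop:
    """Toy predictive loop: hallucinate, compare, re-contextualize on surprise."""

    def __init__(self, contexts, threshold=1.0):
        self.contexts = contexts            # name -> predictive model (zero-arg callable)
        self.threshold = threshold          # tolerated prediction error before switching
        self.current = next(iter(contexts)) # start in an arbitrary context

    def step(self, observation):
        # 1. Hallucinate: predict the observation from the internal model alone.
        prediction = self.contexts[self.current]()
        # 2. Compare: the senses only need to report the prediction error.
        error = float(np.linalg.norm(observation - prediction))
        # 3. Re-contextualize only when the hallucination diverges too far.
        if error > self.threshold:
            self.current = min(
                self.contexts,
                key=lambda c: np.linalg.norm(observation - self.contexts[c]()),
            )
        return self.current, error
```

Note that the heavy lifting happens internally: the observation is consulted only to compute an error signal, and a new context is instantiated only when that error crosses the threshold.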
Given the bottleneck in its input receptors (i.e. the five senses), the brain needs to learn how to comprehend its environment from a minimal amount of input. This requires an internal context to be available that aligns with the environmental context; a task requires information from both input and context. This permits efficient sampling of an environment by leveraging the internal contextual model. The most efficient way to do so is via the massive internal connections within a brain, not through a stimulus-response system. In short, controlled hallucination is the mechanism that circumvents the well-known sample inefficiency of reinforcement learning. The Inside Out architecture, by virtue of its drive for efficient sampling, conforms to the “Principle of Least Action”. (This is actually a hint that the kind of generative capability found in GANs may be beyond what is necessary.)
The MERLIN architecture efficiently learns new policies by playing back experiences from a memory system. MERLIN employs an Inside Out architecture as the basis for performing predictions. This Inside Out architecture is a form of optimistic reasoning (think optimistic transactions). The usual stimulus-response paradigm bakes in a mechanism for incorporating uncertain information; this is what motivates the use of probabilistic methods. In an optimistic approach, however, observations are assumed to be certain, and compensation is performed only when a discrepancy is detected. I discussed this ‘just in time’ reasoning in a previous article.
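The optimistic-transaction analogy can be sketched as follows. This is an illustration of the general pattern, not MERLIN’s actual mechanism, and all four callables are hypothetical stand-ins: act immediately on the assumption that the world matches the internal model, and pay the cost of reconciliation only when a conflict is actually detected.

```python
def optimistic_step(expected, act, observe, reconcile):
    """Optimistic reasoning, in the style of an optimistic transaction.

    Commit to the expected state and act immediately; validate afterwards,
    and reconcile only if the observation contradicts the expectation.
    """
    result = act(expected)          # proceed as if the internal model is certain
    actual = observe()              # cheap validation after the fact
    if actual != expected:          # discrepancy detected
        result = reconcile(actual)  # compensate only now
    return result
```

Contrast this with the probabilistic approach, which pays the cost of representing uncertainty on every step, whether or not the world ever deviates from expectation.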
So let’s summarize what we’ve gathered together over the past several articles. Here we have shown the motivation for Inside Out architectures. Previously, we explored sleep and showed how, through dreaming, a system is able to learn by replaying past memories. We also showed how to train a system to become self-aware in the form of ego-motion. These are all very compelling foundations being established to take deep learning systems to the next level of cognition. These are indeed extremely exciting times!
P.S. As I wrote this, I used the term Inside Out and only realized later that it’s the name of a Disney Pixar movie. Read more about the psychological basis of that movie here:
Pixar has a proud tradition of taking things that are incapable of expressing human emotion-robots, toys, rats…psmag.com
But in spite of the clear role that generative models and expectations play in brain function, scientists have yet to…www.quantamagazine.org