The Emergence of Inside Out Architectures in Deep Learning

Carlos E. Perez
Published in Intuition Machine · Jun 17, 2018
Pixar’s “Inside Out”. Use of the same name is purely coincidental.

“The activity of the intuition consists in making spontaneous judgments which are not the result of conscious trains of reasoning. These judgments are often but by no means invariably correct. . . . The exercise of ingenuity in mathematics consists in aiding the intuition through suitable arrangements of propositions, and perhaps geometrical figures or drawings.” — Alan Turing

There are many misunderstandings AI researchers hold on to that lead them to dead ends. The most well-known one is the idea that has driven GOFAI (Good Old Fashioned AI) since the 1950s: that intelligence is reducible to simulating logical reasoning. Deep Learning is a form of intuition machine and has unequivocally demonstrated an alternative path towards logical reasoning. There are two other misunderstandings that need to be corrected to make further advances in general intelligence, and they are related. The first is the failure to recognize that the brain has an “Inside Out” architecture, in which it generates an internal mental model to perform predictions. The second is the assumption that intelligence is not related to consciousness.

A good introduction to these two ideas can be found in a TED Talk by Anil Seth (a neuroscientist), which I recommend watching:

Anil Seth has a similar talk where he argues that we are all “beast machines”; I would use the equivalent term “intuition machines”. My own research is consistent with the theories Seth derives from neuroscience. In fact, it is all consistent with the theories proposed by Daniel Kahneman (behavioral economist), George Lakoff (cognitive linguist), and Jonathan Haidt (social psychologist). There is sufficient corroborating evidence from these fields to validate the developments we find in Deep Learning.

In the above TED Talk, Seth explores the properties of consciousness. He argues that the brain is a prediction engine and that our perception is a form of informed guesswork: the brain makes its best guess of what is out there in the world. The brain forms prior expectations of the world and uses them to generate predictions flowing in the opposite direction, that is, from the inside out, so that our perception is a “controlled hallucination”. This is a very different idea from the more common one of “input-output” machines. The vast majority of machine learning work is based on interpreting an input to produce an output in the form of a prediction. In an Inside Out architecture, predictions come from internal mental models (i.e. hallucinations), and we are informed only of anomalies relative to those mental models. Anil Seth describes this prediction architecture in “The Real Problem.” The architecture was proposed by Helmholtz in the 19th century. Our mental models can override actual perception, an effect we see in visual and auditory illusions.
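To make the contrast concrete, here is a minimal sketch of the inside-out loop in Python: the model hallucinates an expected observation first, and the sensory input is used only to correct the internal belief via a prediction error. The `WorldModel` class and its simple error-driven update are illustrative assumptions for this post, not Seth’s or Helmholtz’s actual model.

```python
import numpy as np

class WorldModel:
    """Toy internal model: keeps a belief about the world and predicts
    the next observation from it. The error-driven update below is an
    illustrative assumption, not a published algorithm."""

    def __init__(self, dim, lr=0.1):
        self.belief = np.zeros(dim)   # the internal "hallucinated" state
        self.lr = lr

    def predict(self):
        # Top-down: generate the expected observation from the belief.
        return self.belief

    def correct(self, observation):
        # Bottom-up: only the prediction error flows inward.
        error = observation - self.predict()
        self.belief += self.lr * error   # nudge the belief toward reality
        return error


# Inside Out loop: perception is the corrected prediction, not the raw input.
model = WorldModel(dim=3)
for observation in np.random.randn(100, 3):
    prediction = model.predict()         # the "controlled hallucination"
    error = model.correct(observation)   # only the anomaly is registered
```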

Consciousness is our mental model of ourselves (i.e. a model of the self). Seth enumerates different models of the conscious self: the bodily self (including interoception), the perspectival self, the volitional self, the narrative self, and the social self. The purpose of our predictive engine is to maintain the essential internal self-models needed to stay alive. In deep learning research, the bodily self has been simulated in work on ego-motion. Progress in AGI research can be tracked by observing which kinds of self-model have been empirically demonstrated in Deep Learning research.

There is a subtle but significant difference in architecture when you go from merely responding to a stimulus to hallucinating a response. It is at least the difference between an insect brain and a mammalian brain. All animals have the ability to react and adjust to the environment, and mammals have inherited the reptilian brain that drives their instinctive behavior. In addition, mammals have a more advanced brain that is able to respond at a more intelligent level. The key development in mammals is the neocortex, which is responsible for higher-order functions. The neurons in the neocortex have evolved specifically to implement an Inside Out architecture.

Blake Richards delivered a talk at ICLR 2018 that covered recent advances in neuroscience’s understanding of neurons. In Richards’s talk, “Ensembles of Neocortical Microcircuits”, he describes the pyramidal neurons found in the neocortex. These pyramidal neurons exhibit behavior that provides additional evidence for an Inside Out architecture:

Top-down inputs carry internal models that influence the predictions made from bottom-up inputs (from the real world).
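A toy sketch of that idea: a unit with two input streams, where a top-down context signal gates how strongly the bottom-up (sensory) drive is expressed. The multiplicative gating of a “basal” and an “apical” stream below is a common simplification, offered only as an assumption for illustration, not as Richards’s specific model.

```python
import numpy as np

def pyramidal_like_unit(bottom_up, top_down, w_basal, w_apical):
    """Two-stream toy unit: basal weights receive bottom-up (sensory) input,
    apical weights receive top-down context that gates how strongly that
    evidence is expressed. The gating rule is an illustrative assumption."""
    basal_drive = np.dot(w_basal, bottom_up)                          # evidence
    apical_gate = 1.0 / (1.0 + np.exp(-np.dot(w_apical, top_down)))  # context
    return np.tanh(basal_drive) * apical_gate

# Example: the same sensory input is expressed differently under
# different top-down expectations.
rng = np.random.default_rng(0)
x, w_b, w_a = rng.normal(size=8), rng.normal(size=8), rng.normal(size=4)
print(pyramidal_like_unit(x, np.ones(4), w_b, w_a))
print(pyramidal_like_unit(x, -np.ones(4), w_b, w_a))
```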

This Inside Out architecture is being recognized as a paradigm shift in neuroscience, where the more conventional stimulus-response paradigm has long been the dominant conceptual framework.

…it was discovered that the likely ancestral state of behavioral organization is one of probing the environment with ongoing, variable actions first and evaluating sensory feedback later (i.e., the inverse of stimulus response).

There are other recent studies that validate this Inside Out paradigm.

Now, given that the biological brain has an Inside Out architecture and not a stimulus-response architecture, we should devote considerable effort to exploring these kinds of architectures and less to the older (and incorrectly informed) ones.

DeepMind’s MERLIN paper by Greg Wayne et al. explores this very idea in much greater detail. Earlier, I wrote about the link between sleep and Deep Learning. Our brains reinforce memories during non-REM sleep and discover novel associations during REM sleep; that is, we alternate between optimization and exploration while we sleep. We shall see how this sleep mechanism combines with an Inside Out architecture. The MERLIN architecture can be quite complicated.

https://arxiv.org/abs/1803.10760

However, it is an extremely compelling architecture, one that leans on internal hallucination rather than relying on reinforcement learning alone. A good metric for measuring intelligence is ‘sampling efficiency’. Conventionally, compression is used as a proxy for measuring generalization; however, I propose that ‘sampling efficiency’ is a much better metric, one that aligns well with this Inside Out paradigm.
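As a rough illustration of what ‘sampling efficiency’ could mean as a metric, the sketch below scores an agent by how few environment interactions it needs before first reaching a target return. The agent/environment interfaces (`act`, `observe`, `reset`, `step`) and the threshold are hypothetical names for this sketch, not a standard benchmark.

```python
def sample_efficiency(agent, env, target_return, max_steps=100_000):
    """Number of environment steps taken before an episode first reaches
    `target_return` (fewer steps = more sample-efficient). The agent/env
    interfaces are hypothetical, named only for this sketch."""
    steps, episode_return = 0, 0.0
    obs = env.reset()
    while steps < max_steps:
        action = agent.act(obs)
        obs, reward, done = env.step(action)
        agent.observe(obs, reward, done)
        episode_return += reward
        steps += 1
        if done:
            if episode_return >= target_return:
                return steps              # the sample-efficiency score
            episode_return = 0.0
            obs = env.reset()
    return None                           # target never reached in the budget
```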

One benefit of an Inside Out design is that it is able to react quickly to its environment. It is primed when a context is identified and proceeds to hallucinate the subsequent sequential behavior. Any divergence of the input from what is expected is rapidly recognized, and a new context is instantiated to compensate for the unexpected inputs.

Given the bottleneck in its input receptors (i.e. the five senses), the brain needs to learn how to comprehend its environment with a minimal amount of input. This requires an internal context that is aligned with the environmental context; the task requires information from both the input and the context. This permits efficient sampling of the environment by leveraging the internal contextual model. The most efficient way to do so is via the massive internal connections within the brain, not through a stimulus-response system. In short, controlled hallucination is the mechanism for circumventing the well-known sample inefficiency of reinforcement learning. An Inside Out architecture, by virtue of its drive for efficient sampling, conforms to the “Principle of Least Action” (this is actually a hint that the kind of generative capability found in GANs may be beyond what is necessary). Just-in-time cognitive processing also fits this argument like a glove.
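One way to picture that sampling story in code: the agent mostly runs on its current context model and touches the input only to check for divergence, re-instantiating a new context when the prediction error exceeds a threshold. The context dictionary and threshold below are illustrative assumptions, not a mechanism from any cited paper.

```python
import numpy as np

def run_with_contexts(observations, contexts, threshold=1.0):
    """Predict from the current context model; sample the input only to check
    for divergence, and switch context when the error is too large.
    `contexts` maps a name to a predicted observation vector (illustrative)."""
    current = next(iter(contexts))            # start from any context
    trace = []
    for obs in observations:
        prediction = contexts[current]        # hallucinate from the context
        error = np.linalg.norm(obs - prediction)
        if error > threshold:                 # expectation violated
            # re-instantiate: pick the context that best explains the input
            current = min(contexts,
                          key=lambda c: np.linalg.norm(obs - contexts[c]))
        trace.append((current, error))
    return trace
```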

The MERLIN architecture efficiently learns new policies by playing back from a memory system. MERLIN employs an Inside Out architecture as the basis of its predictions. This Inside Out architecture is a form of optimistic reasoning (think of optimistic transactions). The usual stimulus-response paradigm bakes in a mechanism for incorporating uncertain information; this is what motivates the use of probabilistic methods. In an optimistic approach, however, observations are assumed to be certain, and compensation is performed only when a discrepancy is detected. I discuss this ‘just-in-time’ reasoning in a previous article.
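By analogy with optimistic transactions, a hedged sketch of that control flow: commit to the prediction first and verify afterwards, paying the cost of reconciliation only when a discrepancy actually shows up. The four callables are hypothetical hooks named for this sketch only, not drawn from the MERLIN paper.

```python
def optimistic_step(predict, act, observe, reconcile, tolerance=1e-3):
    """Optimistic loop: assume the prediction is correct, act on it, and
    compensate only if a later check reveals a discrepancy. All four
    callables are hypothetical hooks; observations are scalars here."""
    expected = predict()                # assume the internal model is right
    act(expected)                       # behave as if the prediction holds
    actual = observe()                  # cheap check against the world
    if abs(actual - expected) > tolerance:
        reconcile(actual)               # compensate only on a real discrepancy
    return actual
```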

So let’s summarize what we’ve gathered over the past several articles. Here we showed the motivation for Inside Out architectures. Previously, we explored sleep and showed how, through dreaming, a system is able to learn by replaying past memories. We also showed how to train a system to become self-aware in the form of ego-motion. These are all very compelling foundations being established to take deep learning systems to the next level of cognition. These are indeed extremely exciting times!

P.S. As I wrote this, I used the term Inside Out and only realized later that it’s the same name as a Disney Pixar movie. Read more about the psychological basis of that movie here:

https://www.youtube.com/watch?v=P0yVuoATjzs

The Algorithmic Level Is the Bridge Between Computation and Brain

Learn more about Deep Learning:

Explore Deep Learning: Artificial Intuition: The Improbable Deep Learning Revolution


Exploit Deep Learning: The Deep Learning AI Playbook
