Equilibrium Discovery in Modular Deep Learning Architectures

Carlos E. Perez · Intuition Machine · Dec 15, 2016
Credit: https://unsplash.com/search/balance?photo=zGqVUL30hF0

Deep Learning (DL) systems will evolve from today's monolithic architectures into more modular ones. The traditional DL system is trained end-to-end with a single objective function and a single optimization algorithm. However, we are already seeing newer systems, like GANs, that involve more than one DL system. GANs employ a generator and a discriminator that are in an adversarial relationship, competing against each other. The main difficulty in training GANs is finding an equilibrium between the two.
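For reference, the original GAN formulation (Goodfellow et al., 2014) frames training as a two-player minimax game, so the equilibrium being sought is a saddle point of a single value function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$

At such a point neither network can improve by changing unilaterally, and alternating gradient updates carry no guarantee of converging to it, which is precisely the training difficulty.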

As we progress to even more complex systems that involve more than two participating DL systems, we need some guidance as to how to coordinate the multiple parties. Here is an example of one of the newer networks, which reveals six different DL systems working in an adversarial context:

Credit: Stacked Generative Adversarial Networks

More advanced DL systems will require even more complex coordination mechanisms.

Civilization actually has a mechanism that is very efficient at coordinating multiple parties: we use currency (i.e. money) to coordinate parties with competing agendas. Perhaps, through the use of market-driven dynamics, we can invent a better mechanism for handling coordination.

It is indeed interesting that in physics there are many conservation laws. Laws such as the conservation of momentum or the conservation of energy govern the physical world and give order to chaos; without them, strange things would happen in our world all the time. The current understanding of the Higgs boson (aka "The God particle") is that its mass sits at a level such that, were it any less or any more, a universe like ours could not exist. Our universe exists because one particular particle has a specific mass. The ultimate hyper-parameter!

Market-driven systems have money as the conserved quantity (with the exception of Quantitative Easing). The basic conceptual idea here is to have a market of networks coordinate themselves with money, a virtual currency so to speak. To drive the system towards objectives, one would institute monetary incentives or disincentives. There isn't any explicit objective function here, other than a mechanism that would "encourage" certain behaviors.
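To make the idea concrete, here is a minimal sketch of a conserved virtual currency coordinating several networks. Everything in it (the agent names, the fixed stake, the winners-split-the-pot rule) is my own toy construction, not a mechanism from any of the papers discussed below. The one point it illustrates is that currency can be redistributed without ever being created or destroyed, so accumulated wealth, rather than an explicit objective function, encodes which behaviors get "encouraged":

```python
import random

class Agent:
    """A stand-in for a network that holds a budget of virtual currency."""
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill      # probability its output is useful this round
        self.budget = 100.0     # initial allocation of virtual currency

    def act(self):
        # Stand-in for a forward pass; True means the output was useful.
        return random.random() < self.skill

def market_round(agents, stake=1.0):
    """Every agent antes a stake; useful agents split the pot.

    The pot is redistributed exactly, so total currency is conserved.
    """
    pot = 0.0
    winners = []
    for agent in agents:
        agent.budget -= stake
        pot += stake
        if agent.act():
            winners.append(agent)
    recipients = winners or agents   # refund everyone if no one was useful
    for agent in recipients:
        agent.budget += pot / len(recipients)

agents = [Agent("net-A", 0.8), Agent("net-B", 0.5), Agent("net-C", 0.2)]
for _ in range(1000):
    market_round(agents)
for agent in agents:
    print(agent.name, round(agent.budget, 1))  # wealth drifts toward useful agents
```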

Market-driven machine learning has in fact been written about previously. In a paper entitled "An Introduction to Artificial Prediction Markets for Classification", the authors describe a framework for fusing multiple classifiers:

The obtained artificial prediction market is shown to be a maximum likelihood estimator. It generalizes linear aggregation, existent in boosting and random forest, as well as logistic regression and some kernel methods.
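The flavor of that framework can be sketched in a few lines. This is a simplified, hypothetical rendering (budget-proportional betting with a single settlement step), not the exact specialization analyzed in the paper: each classifier wagers its budget across the class labels, the normalized money on each label is the market price, and settlement moves wealth toward the classifiers that were right.

```python
import numpy as np

def market_fuse(probs, budgets):
    """Fuse classifier outputs via a one-shot betting market.

    probs:   (n_classifiers, n_classes) predicted class probabilities
    budgets: (n_classifiers,) each participant's wealth

    Each classifier bets its whole budget, spread across classes in
    proportion to its predicted probabilities. The market price of a
    class is the fraction of all money bet on it.
    """
    bets = probs * budgets[:, None]          # money placed on each class
    prices = bets.sum(axis=0) / bets.sum()   # normalized market prices
    return prices, bets

def settle(bets, true_class, prices):
    """Pay bets on the true class at odds 1/price; losing bets are forfeit."""
    return bets[:, true_class] / prices[true_class]   # new budgets

probs = np.array([[0.7, 0.3],    # a strong classifier
                  [0.4, 0.6]])   # a weaker, disagreeing one
budgets = np.array([150.0, 50.0])
prices, bets = market_fuse(probs, budgets)
budgets = settle(bets, true_class=0, prices=prices)
print(prices, budgets)  # total wealth is conserved; it flows to the accurate
```

Note that under this betting strategy the market price is just a wealth-weighted average of the individual predictions, which is the sense in which such a market generalizes linear aggregation.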

Another paper entitled “Multi-period Trading Prediction Markets with Connections to Machine Learning” goes one step further by introducing market makers into the mix:

The analysis shows that the whole market effectively approaches a global objective, despite that the market is designed such that each agent only cares about its own goal. Additionally, the market dynamics provides a sensible algorithm for optimising the global objective. An intimate connection between machine learning and our markets is thus established, such that we could … solve machine learning problems by setting up and running certain markets.
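The standard market maker in the prediction-market literature is Hanson's logarithmic market scoring rule (LMSR). The sketch below implements plain LMSR rather than the paper's specific multi-period construction, but it shows the key property: the maker always quotes prices, so agents can trade against it sequentially over many periods.

```python
import math

class LMSRMarketMaker:
    """Hanson's logarithmic market scoring rule over n outcomes."""
    def __init__(self, n_outcomes, b=10.0):
        self.b = b                      # liquidity parameter
        self.q = [0.0] * n_outcomes     # shares sold per outcome

    def cost(self, q):
        # Convex cost function from which all quotes are derived.
        return self.b * math.log(sum(math.exp(x / self.b) for x in q))

    def prices(self):
        # Price of outcome i is softmax(q / b); prices sum to 1 and
        # can be read as the market's consensus probabilities.
        z = sum(math.exp(x / self.b) for x in self.q)
        return [math.exp(x / self.b) / z for x in self.q]

    def buy(self, outcome, shares):
        """Charge the trader the change in the cost function."""
        new_q = list(self.q)
        new_q[outcome] += shares
        charge = self.cost(new_q) - self.cost(self.q)
        self.q = new_q
        return charge

mm = LMSRMarketMaker(n_outcomes=2)
print(mm.prices())            # [0.5, 0.5] before any trades
mm.buy(outcome=0, shares=5)   # an agent backs outcome 0
print(mm.prices())            # the price of outcome 0 rises
```

The price vector here is exactly a softmax of the outstanding share vector, which already hints at the intimate connection between markets and machine learning that the quote describes.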

Finally, David Balduzzi has a paper, "Cortical Prediction Markets", in which he argues that spiking neural networks are driven by market forces:

“How does the brain encode information about the environment into its structure?”

More importantly, the corollary provides a foundation for cooperative learning. Consider the following basic schema to incentivize rational agents to collaborate (a toy rendering in code follows the list):

(i) each agent estimates its usefulness to other agents,

(ii) incorporates the estimate into its reward function and

(iii) thus maximizes its usefulness to the collective.
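A toy rendering of that three-step schema (my own construction; the function names and the linear reward shaping are assumptions, not code from the paper):

```python
def shaped_reward(agent, agents, base_reward, usefulness, alpha=0.5):
    """Fold an agent's estimated usefulness to its peers into its reward.

    usefulness(a, b) -> float: agent a's estimate of its value to agent b.
    """
    # (i) each agent estimates its usefulness to the other agents
    social = sum(usefulness(agent, other)
                 for other in agents if other is not agent)
    # (ii) it incorporates that estimate into its reward function
    return base_reward + alpha * social
    # (iii) maximizing this reward now maximizes usefulness
    #       to the collective as a side effect

agents = ["A", "B", "C"]
usefulness = lambda a, b: 1.0 if a == "A" else 0.1
print(shaped_reward("A", agents, base_reward=2.0, usefulness=usefulness))
# 2.0 + 0.5 * (1.0 + 1.0) = 3.0
```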

A recent DeepMind paper hidden in the ICLR 2017 haystack, entitled "Metacontrol for Adaptive Imagination-Based Optimization", discusses a particular kind of RL algorithm that coordinates the behavior of multiple experts through a market-driven approach:

Rather than learning a single, fixed policy for solving all instances of a task, we introduce a metacontroller which learns to optimize a sequence of “imagined” internal simulations over predictive models of the world in order to construct a more informed, and more economical, solution.

The metacontroller learned to adapt the amount of computation it performed to the difficulty of the task, and learned how to choose which experts to consult by factoring in both their reliability and individual computational resource costs.
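The economic decision the metacontroller faces can be sketched as a simple cost-benefit search. The expert names, per-step costs, and loss model below are entirely hypothetical stand-ins; the actual paper learns this trade-off with reinforcement learning rather than enumerating it:

```python
def choose_plan(experts, max_steps, task_difficulty):
    """Pick (expert, n_steps) minimizing expected loss plus compute cost.

    experts: list of (name, reliability, cost_per_step) tuples.
    Expected loss is a stand-in model: error shrinks with reliability
    and with more "imagined" simulation steps, scaled by difficulty.
    """
    best = None
    for name, reliability, cost in experts:
        for steps in range(1, max_steps + 1):
            expected_loss = task_difficulty * (1 - reliability) ** steps
            total = expected_loss + cost * steps   # accuracy vs. compute
            if best is None or total < best[0]:
                best = (total, name, steps)
    return best

experts = [("cheap-model", 0.5, 0.05), ("accurate-model", 0.9, 0.4)]
print(choose_plan(experts, max_steps=10, task_difficulty=1.0))
print(choose_plan(experts, max_steps=10, task_difficulty=0.1))
# Harder tasks justify more steps and/or the pricier expert.
```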

In short, what you have here are systems that pursue solutions via the allocation of a scarce commodity (i.e. a currency). Complex organizations require coordination, and a currency is a distributed mechanism for performing that coordination. That is the beauty of markets: the coordination is distributed, and that is the key idea to take away from this.

This indeed is an extremely promising, principled approach (rather than the ad hoc way we have today) to the problem of discovering an equilibrium at the Collaborative Classification with Imperfect Knowledge (CCIK) level of DL intelligence. See: "The Five Capability Levels of Deep Learning Intelligence".

In 2010, John Holland gave a TEDx talk, "Building Blocks and Innovation", in which he describes two big problems concerning Complex Adaptive Systems (CAS):

A theory of mind: internal models that allow anticipation of future actions.

A theory of complex boundary hierarchies of biological cells, ecosystems and economic systems. Boundaries and signals co-evolve in complex adaptive systems.

These two problems are features of market driven systems.

Update (Feb 9, 2017): DeepMind is investigating a similar approach (https://deepmind.com/blog/understanding-agent-cooperation/):

We can think of the trained AI agents as an approximation to economics’ rational agent model “homo economicus”. Hence, such models give us the unique ability to test policies and interventions into simulated systems of interacting agents — both human and artificial.

Further reading: "Improving Scalability of Reinforcement Learning by Separation of Concerns" (https://arxiv.org/pdf/1612.05159v1.pdf)

The Deep Learning AI Playbook: Strategy for Disruptive Artificial Intelligence
