Benefits of modular approach — generalization

By Jaroslav Vítků

In Deep Learning, tasks are usually solved by one big monolithic Artificial Neural Network (ANN).

By contrast, one of the defining properties of the Badger architecture is modularity: instead of one big neural network, Badger is composed of many small Experts that solve the whole task collaboratively. A further assumption is that these Experts share weights, so they are identical at the beginning.
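To make the weight-sharing idea concrete, here is a minimal sketch in plain Python. It is not the Badger implementation: the one-layer tanh policy and the names `make_shared_weights`, `expert_policy` and `run_experts` are assumptions made purely for illustration. It shows only that one set of weights can drive any number of Experts, so the parameter count does not grow with the number of Experts.

```python
import math
import random

def make_shared_weights(obs_dim, act_dim, seed=0):
    """Create ONE weight matrix (act_dim x obs_dim) that every Expert reuses."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
            for _ in range(act_dim)]

def expert_policy(weights, observation):
    """A single Expert: a tiny one-layer tanh policy (illustrative only)."""
    return [math.tanh(sum(w * x for w, x in zip(row, observation)))
            for row in weights]

def run_experts(weights, observations):
    """Apply the same shared policy to each Expert's local observation."""
    return [expert_policy(weights, obs) for obs in observations]

shared = make_shared_weights(obs_dim=3, act_dim=2)
n_params = sum(len(row) for row in shared)

# Hypothetical local observations, one per Expert.
small_obs = [[0.1 * i, 0.2, 0.3] for i in range(2)]
large_obs = [[0.1 * i, 0.2, 0.3] for i in range(10)]

small = run_experts(shared, small_obs)   # 2 Experts
large = run_experts(shared, large_obs)   # 10 Experts, same weights

print(n_params, len(small), len(large))
```

Because the policy is shared, two Experts that receive the same observation produce the same action, regardless of how many other Experts are present.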

Although this approach introduces several complications, it also has benefits, including:

  • The system can dynamically change its size based on the task, so Badger should be able to solve tasks that conventional ANNs cannot.
  • Scaling with the number of Experts: we should be able to train on a small-scale version of the problem (e.g. with a small number of Experts) and then deploy at a larger scale.
  • Modular policies should generalize better (e.g. [Goyal et al, 2020], [Greff et al, 2017]).

This article uses simple experiments to illustrate the following topics:

  1. The ability to train on a small version of a task and then deploy to modified versions of the task without changing the Expert policy (weights): policies trained on small robots generalize to robots of different size and/or shape.
  2. In more complicated cases, the task usually requires the Experts to communicate with each other. One open question is what an efficient communication topology should look like and whether it can be established automatically from data.
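To illustrate why the communication topology matters, the following toy sketch compares two hand-written topologies. It is not the communication mechanism used in the experiments: the averaging rule and the names `ring_topology`, `full_topology`, `communicate` and `rounds_to_spread` are hypothetical. It simply counts how many communication rounds are needed before a signal held by one Expert has reached all the others.

```python
def ring_topology(n):
    """Each Expert talks only to its two neighbours on a ring."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def full_topology(n):
    """Each Expert talks to every other Expert."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def communicate(states, topology):
    """One round: each Expert averages its own state with its neighbours'."""
    new_states = []
    for i, own in enumerate(states):
        group = [own] + [states[j] for j in topology[i]]
        new_states.append(sum(group) / len(group))
    return new_states

def rounds_to_spread(topology, n):
    """Rounds until a signal starting at Expert 0 has reached every Expert."""
    states = [1.0] + [0.0] * (n - 1)
    rounds = 0
    while any(s == 0.0 for s in states):
        states = communicate(states, topology)
        rounds += 1
    return rounds

n = 6
ring_rounds = rounds_to_spread(ring_topology(n), n)
full_rounds = rounds_to_spread(full_topology(n), n)
print(ring_rounds, full_rounds)  # prints "3 1"
```

The fully connected topology spreads information in a single round but needs a number of links that grows quadratically with the number of Experts, while the ring uses few links but needs more rounds. A topology learned from data would have to find its own balance in this trade-off.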

Code for the experiments is available here on GitHub.

Read the full article here.

Originally published on August 9, 2020.




