Designing Neural Networks Through Neuroevolution


By Risto Miikkulainen, AVP of Evolutionary AI

Neuroevolution — the evolution of neural networks — is a general set of techniques that have been actively researched since the 1990s, well before deep learning.

Many of the ideas in Evolutionary AutoML originate from general neuroevolution research — and now that we have a million times more computing power, they can be applied to deep learning as well. To provide a general context for these ideas, as well as to highlight other results that may be similarly useful in the future, we recently wrote an overview on neuroevolution with Uber AI Labs for the inaugural issue of Nature Machine Intelligence. This post summarizes the main points of that review.

Training neural networks without gradients

The main idea behind neuroevolution is to make it possible to train neural networks when gradients are not available. For instance, the network may contain nondifferentiable activation functions or recurrent connections that make gradient computation difficult. Or the training targets may not be known, as is the case in many sequential decision tasks.

For instance, when neural networks are used to implement agents in game playing, optimal actions may not be obvious; only information about how well an agent plays overall may be available. In such cases, evolution can be used to modify both the architecture of the network, i.e., the nodes and connections shown in Figure 1 below, and the connection weight values within it.

Figure 1: General Neuroevolution Process. Neural networks are encoded in a population of chromosomes. Each one is decoded into a network and evaluated in the environment, resulting in a fitness value for that encoding. Crossover and mutation are then performed on the best chromosomes to create the next generation of networks. In this way, neuroevolution establishes a parallel population-based search for a network that solves the task at hand: for instance, playing a game, controlling a robot, or implementing a business decision-making strategy.
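
To make the process in Figure 1 concrete, here is a minimal sketch of the generational loop in Python. It is illustrative only: all names are hypothetical, `fitness_fn` stands in for whatever environment evaluation the task requires, and the chromosomes here are fixed-length weight vectors (a full system such as NEAT would evolve the topology as well).

```python
import numpy as np

def evolve(fitness_fn, genome_size, pop_size=50, generations=100,
           elite_frac=0.2, mutation_std=0.1, seed=0):
    """Minimal generational GA over real-valued genomes (network weights)."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(pop_size, genome_size))
    n_elite = int(pop_size * elite_frac)
    for _ in range(generations):
        # Evaluate: decode each chromosome into a network and measure fitness.
        fitness = np.array([fitness_fn(g) for g in population])
        # Select: keep the highest-fitness chromosomes as parents.
        elite = population[np.argsort(fitness)[::-1][:n_elite]]
        children = []
        while len(children) < pop_size:
            # Crossover: uniform mix of two elite parents.
            p1, p2 = elite[rng.integers(n_elite, size=2)]
            child = np.where(rng.random(genome_size) < 0.5, p1, p2)
            # Mutation: small Gaussian perturbation of the weights.
            children.append(child + rng.normal(0.0, mutation_std, genome_size))
        population = np.array(children)
    return elite[0]  # best chromosome of the last evaluated generation
```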

Such tasks are often formulated as reinforcement learning tasks, and in this sense, neuroevolution can be seen as a form of reinforcement learning. It is, however, not based on learning a value function (i.e., assigning values to actions in a state) but is instead a policy search method: The evolved network represents, and evolution discovers, an entire policy at once.

This is an important distinction because it makes it possible to scale reinforcement learning to domains that are partially observable. Whereas the value-function approach works well on MDP tasks (Markov Decision Processes, where the entire state is known), it does not scale well to POMDP (Partially Observable MDP). If it is not clear what the state is, it is not clear which action values should be updated, making learning unreliable.

In contrast, a policy network can be recurrent, and therefore it can disambiguate the state by taking the history of states into account. Thus, the primary initial motivation for neuroevolution was to extend reinforcement learning to POMDP tasks. Several neuroevolution approaches were developed to this end:

  1. A common one is NEAT, or NeuroEvolution of Augmenting Topologies, wherein not only the connection weights but also the very topology of the network is evolved. The approach starts with a simple network and gradually makes it more complex in order to discover optimal recurrency for the task.
  2. Another powerful idea is to evolve components of the network, such as nodes (as in ESP, or Enforced SubPopulations) or weights (as in CoSyNE, or Cooperative Synapse NeuroEvolution) in separate subpopulations and then combine them into a full network. With this method, subpopulations learn compatible subtasks, and evolution can search for good combinations efficiently.
  3. A third approach is to use CMA-ES (Covariance Matrix Adaptation Evolution Strategy) to evolve the weights. Based on a statistical model of the search so far, new individuals are constructed through intelligent weight mutations. Each of these methods achieved state-of-the-art performance in POMDP benchmarks, such as double pole balancing without velocities (a simplified weight-evolution sketch follows this list).
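
The following is the simplified sketch referred to above: a minimal, mutation-only (1, λ) evolution strategy applied to a fixed-topology recurrent policy. It is deliberately reduced: full CMA-ES would also adapt the covariance of the mutations from the search history, and `episode_return` is a hypothetical function that runs the policy in the task (e.g., pole balancing) and returns the total reward.

```python
import numpy as np

class RecurrentPolicy:
    """Tiny fixed-topology recurrent network. The recurrent hidden state
    integrates observation history, which is what lets an evolved policy
    disambiguate partially observable states."""
    def __init__(self, weights, n_in=3, n_hid=8, n_out=1):
        k = n_hid * (n_in + n_hid + 1)
        self.W = weights[:k].reshape(n_hid, n_in + n_hid + 1)  # in + recurrent + bias
        self.V = weights[k:].reshape(n_out, n_hid)             # output weights
        self.h = np.zeros(n_hid)

    def act(self, obs):
        x = np.concatenate([obs, self.h, [1.0]])
        self.h = np.tanh(self.W @ x)
        return np.tanh(self.V @ self.h)

def n_weights(n_in=3, n_hid=8, n_out=1):
    return n_hid * (n_in + n_hid + 1) + n_out * n_hid

def es_search(episode_return, generations=200, lam=40, sigma=0.1, seed=0):
    """(1, lambda) ES: mutate the parent weight vector, keep the best child."""
    rng = np.random.default_rng(seed)
    parent = rng.normal(0.0, 0.5, n_weights())
    for _ in range(generations):
        offspring = parent + sigma * rng.normal(size=(lam, parent.size))
        returns = [episode_return(RecurrentPolicy(w)) for w in offspring]
        parent = offspring[int(np.argmax(returns))]
    return parent
```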

Powerful ideas

In the current era of scale, with a million times more computing power than we once had, these ideas are more powerful than ever before. They can be used to solve harder POMDP problems, including control, robotics, game playing and artificial life. Interestingly, they are also useful in neural architecture search: NEAT can be used to discover topologies, ESP/CoSyNE to discover useful components from which the deep-learning network is constructed, and CMA-ES to optimize continuous-valued deep-learning hyperparameters.

At the same time, a better understanding of what makes these approaches powerful has started to emerge. While many new reinforcement learning techniques have been developed, some extending to POMDPs, it was surprising to find that a simple Evolutionary Strategy (ES — based on mutation only) works just as well in designing neural networks for such tasks. It turns out that the power of evolution comes from exploration.

Having a population makes it possible to explore more broadly than improving a single individual incrementally. Population-based search can afford a few bad individuals as long as, once in a while, such exploration leads to very good ones, a process that sometimes produces surprising solutions that would be difficult to find through other forms of reinforcement learning. This effect is already present in ES (which searches using mutation only), and it is even more prominent in genetic algorithms (which search using both mutation and crossover).

Neuroevolution advances

Several further ideas have been developed in neuroevolution research that make the approach more powerful — and could also be adapted for deep learning and AutoML:

Novelty search: The first of these ideas is novelty search. The main idea is that in order to maximize exploration and the chance of discovering good solutions that are otherwise hard to find, evolution should optimize not only performance but also the diversity of solutions. That is, it should seek out novel solutions, i.e., those that are different from what has already been discovered.

Such solutions may not always work well on their own, but they might serve as stepping stones for constructing solutions that do (see Figure 2). Novelty is usually defined in the behavioral space rather than the genotypic space, and therefore requires a formal characterization of behavior. It is often combined with a performance objective, either numerically or in a multi-objective setting, or by requiring a minimum performance in one objective while optimizing the other. Novelty search is an important idea in that it leads to truly creative solutions, both useful and hard to discover, that can often surpass typical human ingenuity.

Figure 2: Utilizing Stepping Stones in Novelty Search. Novelty search maximizes not only solution fitness but also solution diversity. Local performance maxima in a globally diverse population form stepping stones that may lead to the discovery of good and surprising solutions: in this case, the highest peaks surrounded by deep valleys.
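
A minimal sketch of how novelty can be computed, under the common formulation in which an individual's behavior is summarized as a fixed-length vector and novelty is the mean distance to its k nearest neighbors in an archive of behaviors seen so far (the function names here are hypothetical):

```python
import numpy as np

def novelty(behavior, archive, k=15):
    """Novelty = mean distance to the k nearest behaviors seen so far.
    `behavior` is a behavioral characterization vector (e.g., the agent's
    final position or a trajectory summary) -- behavioral, not genotypic."""
    if not archive:
        return float("inf")  # first behavior is maximally novel
    dists = np.linalg.norm(np.asarray(archive) - behavior, axis=1)
    return float(np.mean(np.sort(dists)[:k]))

def score(individual, fitness_fn, behavior_fn, archive, min_fitness=0.0):
    """One of the combinations named above: seek novelty, but require a
    minimum level of task performance (illustrative, not a fixed recipe)."""
    b = behavior_fn(individual)
    n = novelty(b, archive)
    archive.append(b)
    return n if fitness_fn(individual) >= min_fitness else -np.inf
```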

Indirect encoding: A second useful idea is indirect encoding, where genetic encoding specifies a process through which the solution is constructed. There are many ways to implement such a process: It could be a grammatical encoding, a developmental mechanism driven by interaction with the environment or a genetic regulatory network. All of these mechanisms are commonly utilized by biological organisms and, therefore, constitute a compelling way to construct complex systems. For instance, a brain is not fully specified genetically, but rather constructed in a developmental process from a genetic starting point.

One particularly useful indirect encoding mechanism has turned out to be one network specifying the weights of another, in a method called HyperNEAT (see Figure 3). The first network receives as its input the coordinates of two nodes in the second network; as its output, it generates the weight on the connection between those nodes.

This network, called a CPPN (compositional pattern-producing network), is evolved through NEAT, with its fitness determined by the performance of the second network. The second network is thus embedded in a metric substrate, which can be sampled at different levels of granularity. Networks that are very large can be generated in this way, potentially including deep-learning networks.

Figure 3: Indirect Encoding in HyperNEAT: A CPPN is evolved to output connection weights for the task network, embedded in the substrate. The substrate can be sampled at different densities, and weights for very large networks can be evolved in this manner.
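
To show the querying mechanism (though not the evolution), the sketch below uses a hand-written stand-in for an evolved CPPN: a fixed composition of simple functions of two nodes' substrate coordinates. Sampling the substrate at a higher resolution yields a much larger weight matrix from exactly the same encoding.

```python
import numpy as np

def cppn(x1, y1, x2, y2):
    """Stand-in for an evolved CPPN: in HyperNEAT this function would itself
    be a network evolved by NEAT; here it is hand-written for illustration."""
    d = np.hypot(x2 - x1, y2 - y1)  # distance between the two substrate nodes
    return np.sin(3.0 * d) * np.exp(-d) + 0.3 * np.cos(2.0 * (x1 * y2 - x2 * y1))

def substrate_weights(resolution):
    """Query the CPPN for every pair of nodes on a resolution x resolution
    grid; denser sampling yields a larger network from the same encoding."""
    coords = [(x, y) for x in np.linspace(-1, 1, resolution)
                     for y in np.linspace(-1, 1, resolution)]
    n = len(coords)
    W = np.empty((n, n))
    for i, (x1, y1) in enumerate(coords):
        for j, (x2, y2) in enumerate(coords):
            W[i, j] = cppn(x1, y1, x2, y2)
    return W

small = substrate_weights(4)   #  16 x  16 weight matrix
large = substrate_weights(16)  # 256 x 256 weights, same genetic encoding
```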

Metalearning: A third category of ideas is in metalearning, i.e., evolving aspects of the neural network other than the weights themselves. These aspects include learning rules, activation functions, loss functions and data selection/augmentation. One important form of metalearning is architecture search, described in a separate blog post. Another form is neuromodulation, which adjusts the plasticity of the connection weights depending on the task and a given reward.

Together with functional modularity, neuromodulation makes it possible to avoid catastrophic forgetting. Neuroevolution can thus be used to discover the modularity and the neuromodulation strategy that make lifelong learning possible.
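
One simple form such neuromodulation can take is a Hebbian weight update gated by a modulatory signal (a generic illustration, not the specific rule of any one system): when the modulator is near zero, the affected weights are effectively frozen, protecting earlier behavior; when it is large, they remain plastic.

```python
import numpy as np

def modulated_hebbian_step(W, pre, post, modulator, eta=0.01):
    """Generic neuromodulated Hebbian update: delta_W = modulator * eta *
    outer(post, pre). Evolution would shape how `modulator` is computed
    from task context and reward; here it is simply passed in."""
    return W + modulator * eta * np.outer(post, pre)

# Hypothetical usage inside a lifelong-learning loop:
W = np.zeros((4, 3))
pre = np.array([1.0, 0.0, 0.5])        # presynaptic activations
post = np.array([0.2, 0.9, 0.0, 0.1])  # postsynaptic activations
W = modulated_hebbian_step(W, pre, post, modulator=1.0)  # plastic: new task
W = modulated_hebbian_step(W, pre, post, modulator=0.0)  # frozen: old behavior kept
```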

Coevolution: A fourth idea is coevolution, i.e., discovering complex behavior by evolving two or more populations simultaneously. Coevolution can be collaborative, as in the ESP/CoSyNE methods of evolving partial solutions. Components can also be evolved in subpopulations, together with a population of blueprints that specify how to put them together. This method was originally designed to evolve nodes and networks, but it was recently extended to evolving deep-learning components and deep-learning architectures.

On the other hand, coevolution can be competitive, in which case the two populations try to outdo each other. In this manner, one population provides a fitness function for the other, and that function changes over time. The research challenge is to make sure the conflicts lead to progress in an absolute sense and do not get mired in a loop or an unproductive dimension. Competitive coevolutionary systems are similar to self-play systems, like those used in constructing AlphaGo Zero. They are well understood in both theory and practice, and should be useful in constructing complex systems in the future as well.
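
A minimal sketch of competitive fitness evaluation, assuming a user-supplied `game(a, b)` that returns a's payoff against b (all names hypothetical): each individual is scored against a random sample of the opposing population, so the fitness function shifts as the opponents evolve. In practice, devices such as hall-of-fame archives of past opponents are often used to keep the arms race progressing in an absolute sense rather than cycling.

```python
import numpy as np

def competitive_fitness(pop_a, pop_b, game, n_opponents=10, seed=0):
    """Score each individual in pop_a against a sample of pop_b.
    Because pop_b is itself evolving, this fitness function changes
    over time: each population drives the other forward."""
    rng = np.random.default_rng(seed)
    fitness = []
    for a in pop_a:
        idx = rng.choice(len(pop_b), size=min(n_opponents, len(pop_b)),
                         replace=False)
        fitness.append(np.mean([game(a, pop_b[j]) for j in idx]))
    return np.array(fitness)
```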

Open-ended progress

Indeed, one remaining frontier for neuroevolution is to take advantage of such evolutionary processes to establish open-ended progress. In cognitive science, artificial life, game playing and robotics, an important goal is to be able to generate increasingly complex behaviors. The goal is not simply to reach a certain level of performance but to discover increasingly sophisticated behaviors indefinitely.

Many of the methods described above may turn out to be essential ingredients in this process:

  • Coevolution may be useful in establishing new challenges for evolution to overcome.
  • Novelty search may be useful in continually establishing new stepping stones.
  • Indirect encoding may be useful in developing increasingly complex representations as the challenges and stepping stones grow more complex.
  • Neuromodulation may be needed to avoid forgetting earlier behaviors.
  • Neural architecture search may be needed to provide the modularity and increased processing capacity these behaviors require.

A primary challenge that remains in the face of these tools is identifying environments and tasks where open-ended evolution can be tested. Once we can do that, neuroevolution technology might be ready to cross that frontier.

About the Author

Risto Miikkulainen is Associate VP of Evolutionary AI at Cognizant and a Professor of Computer Science at the University of Texas at Austin. His current research focuses on methods and applications of neuroevolution, as well as neural network models of natural language processing and vision, and he is the author of over 430 articles in these research areas.
