BANANAS: A new method for neural architecture search

Colin White
Oct 28 · 3 min read

In this post, we discuss a new state-of-the-art algorithm for neural architecture search.
arXiv paper: https://arxiv.org/abs/1910.11858
Source code: https://www.github.com/naszilla/bananas

(Image source: http://neuralnetworksanddeeplearning.com/images/tikz41.png)

Neural architecture search (NAS) is one of the hottest research areas in machine learning, with hundreds of papers released in the last few years (see this website). In neural architecture search, the goal is to use an algorithm (sometimes even a neural network) to learn the best neural architecture for a given dataset.

The most popular techniques for NAS include reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient-based methods. Each technique has its strengths and drawbacks. For example, Bayesian optimization (BayesOpt) is theoretically one of the most promising methods and has seen huge success in hyperparameter optimization for ML, but it is very challenging to run for NAS in practice. BayesOpt works by building a model of the space of neural architectures and then using that model to tell you which neural architecture to try next. See our previous blog post for an introduction to BayesOpt for NAS. The difficulty is that setting up BayesOpt for NAS requires a huge amount of human effort, both to create a hand-crafted distance function between architectures and to tune a Gaussian process.
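To make the "model, then choose what to try next" loop concrete, here is a toy sketch in Python. It is not real BayesOpt and not the code from our repository: architectures are just tuples of bits, `toy_accuracy` is a synthetic stand-in for training a network, and the surrogate is a bootstrapped nearest-neighbor predictor standing in for the probabilistic model. All function names and parameters are made up for illustration.

```python
import random


def toy_accuracy(arch):
    """Synthetic stand-in for training an architecture and measuring accuracy."""
    return sum(arch) / len(arch)


def fit_surrogate(history, rng):
    """Fit a crude model to the (architecture, accuracy) pairs seen so far.

    Assumption: a 1-nearest-neighbor predictor on a bootstrap resample stands
    in for the probabilistic model (a Gaussian process in standard BayesOpt).
    """
    resample = [rng.choice(history) for _ in history]

    def predict(arch):
        def hamming(item):
            return sum(a != b for a, b in zip(item[0], arch))
        return min(resample, key=hamming)[1]

    return predict


def search(n_bits=8, n_init=5, n_iters=15, n_candidates=30, seed=0):
    rng = random.Random(seed)

    def random_arch():
        return tuple(rng.randint(0, 1) for _ in range(n_bits))

    # Start from a few randomly chosen, fully evaluated architectures.
    history = [(arch, toy_accuracy(arch))
               for arch in (random_arch() for _ in range(n_init))]
    for _ in range(n_iters):
        predict = fit_surrogate(history, rng)              # 1. model what we have seen
        candidates = [random_arch() for _ in range(n_candidates)]
        chosen = max(candidates, key=predict)              # 2. pick the most promising
        history.append((chosen, toy_accuracy(chosen)))     # 3. evaluate it, then repeat
    return history


history = search()
best_arch, best_acc = max(history, key=lambda item: item[1])
print(best_arch, best_acc)
```

The expensive step in real NAS is step 3, since each evaluation means training a network, which is why the quality of the model in step 1 matters so much.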

Schematic of the meta neural network from BANANAS

In our new paper, we design BANANAS, a novel NAS algorithm which uses Bayesian optimization with a neural network model instead of a Gaussian process (GP) model. That is, in every iteration of Bayesian optimization, we train a meta neural network to predict the accuracy of unseen neural architectures in the search space. This technique gets rid of the aforementioned problems with Bayesian optimization for NAS: the model is powerful enough to predict neural network accuracies, and there is no need to construct a distance function between neural networks by hand.
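As a rough illustration of what "a neural network that predicts accuracy" means, here is a tiny regression network written in plain Python. This is a sketch under heavy assumptions, not the meta neural network from the paper: the architecture encodings are random bit vectors, the "accuracies" come from a made-up formula (`synthetic_accuracy`), and the network size, learning rate, and function names are all chosen only for the example.

```python
import math
import random


def init_mlp(n_in, n_hidden, rng):
    """One hidden tanh layer and a single scalar output."""
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    y = sum(w * hj for w, hj in zip(net["W2"], h)) + net["b2"]
    return h, y


def train(net, data, epochs=200, lr=0.05):
    """Plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(net, x)
            err = y - target  # gradient of 0.5 * (y - target)^2 with respect to y
            for j, hj in enumerate(h):
                grad_pre = err * net["W2"][j] * (1.0 - hj * hj)  # back through tanh
                net["W2"][j] -= lr * err * hj
                net["b1"][j] -= lr * grad_pre
                for i, xi in enumerate(x):
                    net["W1"][j][i] -= lr * grad_pre * xi
            net["b2"] -= lr * err
    return net


def mean_abs_error(net, data):
    return sum(abs(forward(net, x)[1] - t) for x, t in data) / len(data)


# Synthetic data: made-up per-bit effects on a made-up base accuracy of 0.90.
rng = random.Random(0)
bit_effects = [0.01, -0.01, 0.02, -0.02, 0.005, -0.005, 0.015, -0.015]


def synthetic_accuracy(x):
    return 0.90 + sum(w * xi for w, xi in zip(bit_effects, x))


data = []
for _ in range(60):
    x = [rng.randint(0, 1) for _ in range(8)]
    data.append((x, synthetic_accuracy(x)))
train_set, test_set = data[:50], data[50:]

net = init_mlp(n_in=8, n_hidden=8, rng=rng)
error_before = mean_abs_error(net, test_set)
train(net, train_set)
error_after = mean_abs_error(net, test_set)
print(error_before, error_after)
```

The point of the sketch is only the shape of the problem: encoded architectures go in, a predicted accuracy comes out, and the predictor is judged on architectures it has not seen.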

Path-based encoding of a neural architecture

We use a path-based encoding scheme to encode a neural architecture, which drastically improves the predictive accuracy of our meta neural network. After training on just 200 random neural architectures, we are able to predict the validation accuracy of a new neural architecture to within one percent of its true accuracy on average, for multiple popular search spaces. BANANAS also utilizes a novel variant of Thompson sampling for the acquisition function in Bayesian optimization.
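Here is a minimal sketch of one way such an encoding could be computed, assuming it has one binary feature for every possible sequence of operations on a path from the cell's input to its output. The cell representation (an upper-triangular adjacency matrix plus a list of node labels), the operation names, and the function names are assumptions for illustration; see the paper and the repository for the actual implementation.

```python
from itertools import product

OPS = ["conv1x1", "conv3x3", "maxpool3x3"]  # hypothetical operation names


def paths_in_cell(adjacency, ops):
    """Collect the operation sequence along every input-to-output path.

    Node 0 is the input and the last node is the output; neither carries an
    operation. Edges only go from lower to higher node indices.
    """
    n = len(adjacency)
    found = set()

    def walk(node, seq):
        if node == n - 1:
            found.add(seq)
            return
        for nxt in range(node + 1, n):
            if adjacency[node][nxt]:
                if nxt == n - 1:
                    walk(nxt, seq)
                else:
                    walk(nxt, seq + (ops[nxt],))

    walk(0, ())
    return found


def path_encoding(adjacency, ops, max_ops_on_path=5):
    """Binary vector with one entry per possible path of 0..max_ops_on_path ops."""
    all_paths = [p for length in range(max_ops_on_path + 1)
                 for p in product(OPS, repeat=length)]
    present = paths_in_cell(adjacency, ops)
    return [1 if p in present else 0 for p in all_paths]


# A small example cell: input -> conv3x3 -> output, input -> maxpool3x3 -> output,
# and input -> conv3x3 -> maxpool3x3 -> output.
ops = ["input", "conv3x3", "maxpool3x3", "output"]
adjacency = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
encoding = path_encoding(adjacency, ops)
print(len(encoding), sum(encoding))  # 364 possible paths, 3 of them present
```

With three operations and up to five operations on a path, there are 1 + 3 + 9 + 27 + 81 + 243 = 364 possible paths, so every cell maps to a fixed-length vector regardless of how its nodes happen to be numbered.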

Neural architecture search experiments on the NASBench search space on CIFAR-10

We tested BANANAS on two of the most popular search spaces, the NASBench and DARTS search spaces, and our algorithm performed better than all other algorithms we tried, including evolutionary search, reinforcement learning, standard BayesOpt, AlphaX, ASHA, and DARTS. The best architecture found by BANANAS achieved 2.57% test error on CIFAR-10, on par with state-of-the-art NAS algorithms.

Normal cell (left) and reduction cell (right) of the best architecture learned by BANANAS on the DARTS search space.

Included in the GitHub repository is a Jupyter notebook which lets you easily train a meta neural network on the NASBench dataset. Input your favorite combination of hyperparameters to try to achieve the best prediction accuracy on NASBench!

RealityEngines.AI Blog

Learn about cutting-edge developments in Artificial Intelligence, Machine Learning, and more

Written by Colin White, Research Scientist at RealityEngines.AI
