Dynamic Neural Architecture Search with Intel Neural Compressor
A More Efficient Way to Do Neural Architecture Search
Authors: Xinyu Ye, Haihao Shen, Anthony Sarah, Daniel Cummings, and Maciej Szankin, Intel Corporation
Intel Neural Compressor is an open-source Python library for model compression. It reduces the model size and increases the speed of deep learning (DL) inference for deployment on CPUs or GPUs. It provides unified interfaces across multiple DL frameworks for popular network compression technologies like quantization, pruning, knowledge distillation, and neural architecture search (NAS). In this blog, we introduce a super-network-based NAS approach called dynamic neural architecture search (DyNAS) that is >4x more sample efficient than typical one-shot, predictor-based NAS approaches.
NAS
NAS has seen rapid growth in the machine learning research community. It automates the discovery of optimal deep neural network architectures in domains like computer vision and natural language processing. While there have been many recent advancements, there is still a significant focus on making the search more efficient to reduce the computational cost incurred when validating discovered architectures.
Super-Networks
The computational overhead of evaluating deep neural network architectures during the search process can be costly due to the training and validation cycles. Novel weight-sharing approaches known as one-shot or super-networks offer a way to mitigate the training overhead. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique individual architectures. This enables sub-network model extraction and validation without a separate training cycle (Figure 1).
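To make the weight-sharing mechanism concrete, here is a minimal sketch in plain PyTorch (not the Intel Neural Compressor API; the ElasticLinear class and its arguments are illustrative). A layer whose parameters are allocated at the maximum width can serve narrower sub-networks by slicing its shared weights, so each sub-network behaves as a unique architecture without its own training run:

import torch
import torch.nn as nn

class ElasticLinear(nn.Module):
    """A toy weight-sharing layer: sub-networks reuse slices of one weight matrix."""
    def __init__(self, in_features, max_out_features):
        super().__init__()
        # Parameters are allocated once, at the maximum width of the super-network.
        self.weight = nn.Parameter(torch.randn(max_out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(max_out_features))

    def forward(self, x, out_features):
        # A narrower sub-network simply uses the leading slice of the shared
        # weights; no separate training cycle is required to evaluate it.
        return nn.functional.linear(x, self.weight[:out_features], self.bias[:out_features])

layer = ElasticLinear(in_features=16, max_out_features=64)
x = torch.randn(8, 16)
full = layer(x, out_features=64)  # the super-network at full width
sub = layer(x, out_features=32)   # an extracted sub-network, same weights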
DyNAS Methodology
Evolutionary algorithms, specifically genetic algorithms, have a long history of use in NAS and continue to gain popularity as an efficient way to explore the architecture objective space. We will show how evolutionary algorithms can be paired with lightly trained objective predictors in an iterative cycle to accelerate multi-objective architecture exploration. Specifically, we use a bi-level optimization approach that we denote DyNAS (Figure 2).
In the first phase of the search, a small population of sub-networks is randomly sampled from the super-network and evaluated (validation measurement) to provide the initial training set for the inner predictor loop. After the predictors are trained, a multi-objective evolutionary search is performed in the predictor objective space. Once this extensive search completes, the best-performing sub-network configurations are selected as the next iteration's validation population. The cycle continues until the user-defined evaluation count is met, as sketched below.
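The loop can be summarized with the following simplified sketch. It is illustrative only: the callables passed in (sample, measure, fit_predictors, predictor_search) are hypothetical placeholders, not the Intel Neural Compressor API.

def dynas_search(sample, measure, fit_predictors, predictor_search,
                 num_evals=250, population=50):
    # Phase 1: randomly sample and validate a small seed population of
    # sub-networks to train the objective predictors on.
    archive = [(cfg, measure(cfg)) for cfg in sample(population)]
    while len(archive) < num_evals:
        # Inner loop: train lightweight objective predictors (accuracy,
        # MACs, ...) on every sub-network validated so far.
        predictors = fit_predictors(archive)
        # Outer loop: run an extensive multi-objective evolutionary search
        # (e.g., NSGA-II) entirely in the cheap predictor objective space.
        candidates = predictor_search(predictors)
        # Validate only the best predicted sub-networks; they become the
        # training data for the next iteration's predictors.
        archive += [(cfg, measure(cfg)) for cfg in candidates[:population]]
    return archive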
NAS API in Intel Neural Compressor
There are two ways to use the NAS API. One is through a YAML configuration file, e.g.:
from neural_compressor.experimental.nas import NAS

agent = NAS('config.yaml')
results = agent.search()
The config.yaml file will look like the following, where the approach section specifies which NAS approach to use:
nas:
  approach: dynas
  search:
    search_algorithm: 'nsga2'
  dynas:
    supernet: 'ofa_resnet50'
    metrics: ['acc', 'macs']
    …
The other way is to use the NASConfig class. The code snippet below demonstrates how to perform a multi-objective NAS on a MobileNetV3 one-shot weight-sharing super-network for the image classification task on ImageNet-ilsvrc2012:
from neural_compressor.conf.config import NASConfig
from neural_compressor.experimental.nas import NAS

config = NASConfig(approach='dynas', search_algorithm='nsga2')
config.dynas.supernet = 'ofa_mbv3_d234_e346_k357_w1.2'  # MobileNetV3 super-network
config.dynas.metrics = ['acc', 'macs']  # objectives to optimize
config.dynas.population = 50            # sub-networks validated per iteration
config.dynas.num_evals = 250            # total validation budget
config.dynas.results_csv_path = 'search_results.csv'
config.dynas.batch_size = 64
config.dynas.dataset_path = '/datasets/imagenet-ilsvrc2012'  # example
agent = NAS(config)
results = agent.search()
This example uses the Intel Neural Compressor DyNAS search approach with NSGA-II as the search algorithm and two search metrics: 'acc' (ImageNet Top-1 accuracy, %) and 'macs' (multiply-and-accumulate operations, as measured by FVCore).
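For reference, the sketch below shows one way MACs can be measured with FVCore; FlopCountAnalysis counts one fused multiply-add as a single operation. The torchvision MobileNetV3 model here is only a stand-in for an extracted sub-network:

import torch
from fvcore.nn import FlopCountAnalysis
from torchvision.models import mobilenet_v3_large

# Count multiply-accumulates for a single 224x224 image.
model = mobilenet_v3_large().eval()
macs = FlopCountAnalysis(model, torch.randn(1, 3, 224, 224)).total()
print(f"MACs: {macs / 1e6:.1f}M")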
Results from the search process are pairs of model architectures and the measured metrics of those architectures (Figure 3). For analysis, the Pareto front of the search results can be plotted with the code below. For more details about this example, please refer to the MobileNetV3 supernet NAS notebook example.
# Plot search results in the multi-objective space
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable

fig, ax = plt.subplots(figsize=(7, 5))
number_of_evals = 250  # matches config.dynas.num_evals above

# Load the results written by the DyNAS run above.
df_dynas = pd.read_csv(config.dynas.results_csv_path)[:number_of_evals]
df_dynas.columns = ['config', 'date', 'lat', 'macs', 'top1']

cm = plt.get_cmap('viridis_r')
count = list(range(len(df_dynas)))

# Color each discovered model by its evaluation order.
ax.scatter(df_dynas['macs'].values, df_dynas['top1'].values, marker='^', alpha=0.8,
           c=count, cmap=cm, label='Discovered DNN Model', s=10)
ax.set_title(f'Intel® Neural Compressor\nDynamic NAS (DyNAS)\nSupernet:{config.dynas.supernet}')
ax.set_xlabel('MACs', fontsize=13)
ax.set_ylabel('Top-1 Accuracy (%)', fontsize=13)
ax.legend(fancybox=True, fontsize=10, framealpha=1, borderpad=0.2, loc='lower right')
ax.grid(True, alpha=0.3)

# Evaluation-count colorbar
norm = plt.Normalize(0, len(df_dynas))
sm = ScalarMappable(norm=norm, cmap=cm)
cbar = fig.colorbar(sm, ax=ax, shrink=0.85)
cbar.ax.set_title(" Evaluation\n Count", fontsize=8)
fig.tight_layout(pad=2)
plt.show()
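Beyond the scatter plot, the Pareto-optimal points (maximum Top-1 accuracy for a given MACs budget) can be extracted from the same df_dynas DataFrame with a simple sweep. This is a generic sketch continuing the code above, not an Intel Neural Compressor utility:

# Sort by MACs, then keep each model that improves on the best accuracy
# seen so far; the survivors form the Pareto front.
df_sorted = df_dynas.sort_values('macs')
pareto_rows, best_top1 = [], float('-inf')
for _, row in df_sorted.iterrows():
    if row['top1'] > best_top1:
        pareto_rows.append(row)
        best_top1 = row['top1']
df_pareto = pd.DataFrame(pareto_rows)
print(df_pareto[['macs', 'top1']])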
Future Work
We plan to add more NAS approaches to Intel Neural Compressor. We invite users to try this DyNAS example and send us feedback through GitHub issues.
We encourage you to check out Intel’s other AI Tools and Framework optimizations and learn about the unified, open, standards-based oneAPI programming model that forms the foundation of Intel’s AI Software Portfolio.