Predicting Downstream Model Performance at Early Training Stages: A New Perspective on Neural Network Selection via Edge Dynamics
Fine-tuning pretrained large-scale deep neural networks (NNs) for downstream tasks has become the status quo in the deep learning community. A challenge facing researchers is how to efficiently select the most appropriate pretrained model for a given downstream task, as this process typically requires training each candidate model to predict its performance, which is computationally expensive.
In the new paper Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics, a research team from Rensselaer Polytechnic Institute, Thomas J. Watson Research Center and the University of California, Los Angeles proposes a novel framework for effective NN selection for downstream tasks. The method is designed to forecast the predictive ability of a model with its cumulative information, and to save resources by doing so in the early phase of NN training.
The team summarizes their contributions as:
- View NN training as a dynamical system over synaptic connections, and investigate, for the first time, the interactions of synaptic connections from a microscopic perspective.
- Propose a neural capacitance metric βeff for neural network model selection.
- Empirical results on 17 pretrained models across five benchmark datasets show that our βeff-based approach outperforms current learning curve prediction approaches.
- For rank prediction based on the performance of pretrained models, our approach improves over the best baseline by 9.1/38.3/12.4/65.3/40.1% on CIFAR10/CIFAR100/SVHN/Fashion MNIST/Birds, using observations from learning curves of only 5 epochs.
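Rank prediction of this kind is commonly evaluated by comparing the ranking induced by an early-stage score against the ranking by true final performance. As a minimal illustrative sketch (the accuracy values and function names below are hypothetical, not taken from the paper), Spearman's rank correlation can quantify how well early-epoch scores preserve the final ordering of candidate models:

```python
def ranks(values):
    # Rank items (1 = smallest); ties are not handled, for simplicity.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(x, y):
    # Spearman correlation via the classic formula
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)).
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Illustrative (made-up) scores for five candidate pretrained models:
# true final test accuracies vs. scores predicted after a few epochs.
true_acc   = [0.92, 0.88, 0.95, 0.81, 0.90]
early_pred = [0.60, 0.55, 0.66, 0.50, 0.58]
print(spearman(true_acc, early_pred))  # 1.0 here, since the orderings agree
```

A higher correlation means the early-stage metric (βeff in the paper's case) is a more reliable proxy for selecting the eventual best-performing model without full training.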
The proposed framework is based on the idea that backpropagation during NN training is equivalent to the dynamical evolution of synaptic connections (edges) and that a converged neural network is…