The #paperoftheweek is “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks”.
The authors of this paper propose a novel way to prune large trained neural networks, producing sparse, lightweight subnetworks (winning lottery tickets) that reach at least the same task performance in at most the same training time, with fewer parameters than their large original counterparts. The lottery ticket hypothesis states the following: “A randomly-initialized, dense neural network contains a subnetwork that is initialized such that — when trained in isolation — it can match the test accuracy of the original network after training for at most the same number of iterations.”
The process of finding a winning ticket starts by randomly initializing a neural network. The network is then trained for j iterations to arrive at its trained parameters, a percentage of these parameters is pruned, the remaining parameters are reset to their initial values, and finally the resulting subnetwork (the winning ticket) is retrained. Interestingly, it is crucial to retrain the subnetwork from the reset initial parameters, since not doing so harms its task performance. The authors investigate this by comparing winning tickets retrained from their original initial values against the same subnetworks retrained from new random values. Random reinitialization harms the learning process, while the original initial values lead to superior results.
They demonstrate these findings on fully connected networks and on convolutional networks. Remarkably, in some experiments they prune as much as 93% of the parameters to produce the winning ticket and still get better performance than the original large network. In general, performance peaks at a specific percentage of pruned parameters: it rises as pruning increases up to this optimum, and beyond it the subnetwork's performance declines until it falls back to that of the original large network.
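To make the train, prune, reset and retrain loop described above more concrete, here is a minimal sketch in plain Python. It is not the authors' code: it uses a toy linear regression model instead of a neural network, it assumes that pruning removes the weights with the smallest magnitude (the summary above only says that a percentage of the parameters is pruned), and all names (`train`, `prune_smallest`, `find_winning_ticket`) are hypothetical.

```python
import random

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train(start, mask, data, steps=200, lr=0.1):
    """Full-batch gradient descent on squared error; pruned weights stay at zero."""
    w = [wi * mi for wi, mi in zip(start, mask)]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / len(data)
        w = [(wi - lr * gi) * mi for wi, gi, mi in zip(w, grad, mask)]
    return w

def prune_smallest(trained, mask, fraction):
    """Assumed criterion: drop the smallest-magnitude `fraction` of surviving weights."""
    alive = [i for i, m in enumerate(mask) if m]
    drop = sorted(alive, key=lambda i: abs(trained[i]))[: int(len(alive) * fraction)]
    return [0 if i in drop else m for i, m in enumerate(mask)]

def find_winning_ticket(init, data, fraction=0.5, rounds=2):
    mask = [1] * len(init)
    for _ in range(rounds):
        trained = train(init, mask, data)               # train from the ORIGINAL init
        mask = prune_smallest(trained, mask, fraction)  # prune a percentage
    # Reset: survivors go back to their initial values, pruned weights are zero.
    ticket_init = [wi * mi for wi, mi in zip(init, mask)]
    return ticket_init, mask

# Toy data (an assumption for illustration): only the first 2 of 8 inputs matter.
rng = random.Random(0)
data = []
for _ in range(64):
    x = [rng.uniform(-1.0, 1.0) for _ in range(8)]
    data.append((x, 3.0 * x[0] - 2.0 * x[1]))
init = [rng.uniform(-0.5, 0.5) for _ in range(8)]  # step 1: random initialization

ticket_init, mask = find_winning_ticket(init, data)
final = train(ticket_init, mask, data)              # retrain the sparse subnetwork
print("mask:", mask)
print("ticket loss:", mse(final, data))
```

Because this toy problem is convex, it only illustrates the mechanics of masking and resetting. It cannot reproduce the paper's finding that the original initial values matter more than a random reinitialization.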
The implications of this work can be used to explore several questions, such as how to improve training performance, how to design better networks, and how to advance the theoretical understanding of deep neural networks.
“Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance. We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that — when trained in isolation — reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective. We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. We consistently find winning tickets that are less than 10–20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy.”
You can read the full article here.
About the author:
Ignacio Alvizu, Deep Learning Researcher at Brighter AI.
About Brighter AI:
Brighter AI has developed an innovative privacy solution for visual data: Deep Natural Anonymization. The solution replaces personally identifiable information such as faces and license plates with artificial objects, thereby enabling all AI and analytics use cases, e.g. self-driving cars and smart retail. In 2018, NVIDIA named the German company “Europe’s Hottest AI Startup”.