MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks (paper review)

Stan Kriventsov
Deep Learning Reviews
3 min read · Oct 21, 2020


Review of paper by Zhiqiang Shen and Marios Savvides, Carnegie Mellon University, 2020

Originally published in Deep Learning Reviews on October 21, 2020.

The authors used a version of the recently proposed MEAL technique (Multi-Model Ensemble via Adversarial Learning, which distills knowledge from multiple large teacher networks into a smaller student network with the help of a discriminator) to raise the top-1 accuracy of ResNet-50 on ImageNet, at a 224×224 input size, to 80.67% without external training data or modifications to the network architecture.
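The core of the distillation objective can be sketched in a few lines: the teachers' softmax outputs are averaged into an ensemble soft label, and the student is trained to match that distribution (the paper additionally trains a discriminator for the adversarial part, which is omitted here). This is a minimal NumPy illustration under those assumptions; the function names are mine, not the authors'.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_soft_labels(teacher_logits_list):
    # Average the teachers' predicted distributions to form the soft target.
    probs = [softmax(t) for t in teacher_logits_list]
    return np.mean(probs, axis=0)

def distillation_loss(student_logits, soft_labels, eps=1e-12):
    # KL divergence from the ensemble soft labels to the student's predictions,
    # averaged over the batch.
    p = soft_labels
    q = softmax(student_logits)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean())
```

Note that, unlike standard supervised training, no one-hot ground-truth labels appear in this loss: the student learns entirely from the teachers' soft distributions.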

What can we learn from this paper?

That even a relatively small network can be trained to achieve the accuracy of much larger networks with the right approach.

In a way, this is not surprising: modern deep neural networks are deliberately overparameterized to take advantage of the multitude of randomly initialized subnetworks, as described in the "Lottery Ticket Hypothesis" paper, so it makes sense that a smaller network can suffice to reach similar performance. Still, it is really nice to see how this can be implemented in practice.

Prerequisites (what should one be familiar with to better understand the paper?)



Software/ML Engineer at Google. Founder of Deep Learning Reviews: https://www.dl.reviews. Former pro chess and poker player.