BERT-of-Theseus: Compressing BERT by Progressive Module Replacing

Synced · Published in SyncedReview · 3 min read · Feb 29, 2020

Content provided by Wangchunshu Zhou, co-first author of the paper BERT-of-Theseus: Compressing BERT by Progressive Module Replacing.

What’s New: In this paper, we propose a novel model compression approach that effectively compresses BERT by progressive module replacing. Compared with previous knowledge distillation approaches for BERT compression, our approach uses only one loss function and one hyper-parameter, freeing practitioners from extensive hyper-parameter tuning. It outperforms existing knowledge distillation approaches on the GLUE benchmark, offering a new perspective on model compression.

How It Works: The approach progressively substitutes modules of BERT with modules of fewer parameters. It first divides the original BERT into several modules and builds a compact substitute for each. Then, during training, original modules are randomly replaced with their substitutes, so the compact modules learn to mimic the behavior of the original ones. The probability of replacement is progressively increased over the course of training. In this way, the approach enables a deeper level of interaction between the original and compact models and smooths the training process. A rough sketch of this mechanism is given below.
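The sketch below illustrates the idea in PyTorch. The module grouping, the class name `TheseusEncoder`, and the linear-schedule constants are illustrative assumptions rather than the authors' released code: each original (predecessor) module is independently swapped for its compact successor with some probability during the forward pass, and the resulting hybrid model is trained with only the downstream task loss.

```python
# Minimal sketch of progressive module replacing (Theseus compression) in PyTorch.
# Names and constants here are illustrative assumptions, not the authors' code.
import random
import torch.nn as nn


class TheseusEncoder(nn.Module):
    """Encoder whose predecessor modules are stochastically swapped for
    compact successor modules during training."""

    def __init__(self, predecessor_modules, successor_modules, replace_prob=0.3):
        super().__init__()
        assert len(predecessor_modules) == len(successor_modules)
        self.predecessor_modules = nn.ModuleList(predecessor_modules)
        self.successor_modules = nn.ModuleList(successor_modules)
        self.replace_prob = replace_prob  # probability of using a compact module

        # The original modules stay frozen; only the compact substitutes learn.
        for p in self.predecessor_modules.parameters():
            p.requires_grad = False

    def set_replace_prob(self, p):
        self.replace_prob = p

    def forward(self, hidden_states):
        # Each module is replaced independently, so gradients flow through
        # hybrids that mix original and compact modules.
        for pred, succ in zip(self.predecessor_modules, self.successor_modules):
            if self.training and random.random() < self.replace_prob:
                hidden_states = succ(hidden_states)   # compact substitute
            else:
                hidden_states = pred(hidden_states)   # original (frozen) module
        return hidden_states


def linear_replacement_schedule(step, k=1e-4, b=0.3):
    """Curriculum: replacement probability grows linearly toward 1 over training
    (the constants k and b are placeholder values)."""
    return min(1.0, k * step + b)
```

Because the predecessor modules are frozen and only the task loss (e.g., cross-entropy on the downstream labels) is used, there is no extra distillation objective to tune; once the replacement probability reaches 1, the successor modules alone form the compressed model.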

Key Insights: Joint training with module replacement may be a promising approach for compressing large neural network models.

Anything else: Applying this model compression approach to ResNet-like models would be an interesting direction.

The paper BERT-of-Theseus: Compressing BERT by Progressive Module Replacing is on arXiv.

Meet the authors: Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei and Ming Zhou from Wuhan University, Beihang University and Microsoft Research Asia.

Share Your Research With Synced Review

Share My Research is Synced’s new column that welcomes scholars to share their own research breakthroughs with over 1.5M global AI enthusiasts. Beyond technological advances, Share My Research also calls for interesting stories behind the research and exciting research ideas. Share your research with us by clicking here.


We know you don’t want to miss any story. Subscribe to our popular Synced Global AI Weekly to get weekly AI updates.

Need a comprehensive review of the past, present and future of modern AI research development? Trends of AI Technology Development Report is out!

2018 Fortune Global 500 Public Company AI Adaptivity Report is out!
Purchase a Kindle-formatted report on Amazon.
Apply for Insight Partner Program to get a complimentary full PDF report.
