Published in SyncedReview
Amazon’s BERT Optimal Subset: 7.9x Faster & 6.3x Smaller Than BERT

The transformer-based architecture BERT has in recent years demonstrated the efficacy of large-scale pretrained models for tackling natural language processing (NLP) tasks such as machine translation and question answering. BERT’s large size and complex pretraining process, however, raise usability concerns for many researchers.
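As a rough illustration of that scale, the minimal sketch below (not from the article) loads a pretrained BERT checkpoint with the Hugging Face Transformers library and counts its parameters; the library and the "bert-large-uncased" checkpoint name are assumptions chosen for the example, not something the article itself uses.

```python
# Minimal sketch, assuming the Hugging Face Transformers library is installed.
# It loads a pretrained BERT checkpoint and reports its parameter count,
# which is one concrete way to see the "large size" the article refers to.
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-large-uncased")  # example checkpoint (assumption)
num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.0f}M")  # roughly 340M for BERT-large
```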
