Microsoft’s New MT-DNN Outperforms Google BERT

Synced
Feb 15, 2019 · 3 min read

Multi-task learning and language model pre-training are popular approaches to many of today’s natural language understanding (NLU) tasks. Now, Microsoft researchers have released technical details of an AI system that combines the two. The new Multi-Task Deep Neural Network (MT-DNN) is a natural language processing (NLP) model that outperforms Google BERT in nine of eleven benchmark NLP tasks.

In their paper Multi-Task Deep Neural Networks for Natural Language Understanding, the Microsoft Research and Microsoft Dynamics 365 authors show MT-DNN learning representations across multiple NLU tasks. The model “not only leverages large amounts of cross-task data, but also benefits from a regularization effect that leads to more general representations to help adapt to new tasks and domains.”

MT-DNN builds on a model Microsoft proposed in 2015 and integrates the network architecture of BERT, a pre-trained bidirectional transformer language model proposed by Google last year.

[Figure: The MT-DNN architecture, with lower text-encoding layers shared across all tasks and task-specific top layers.]

As shown in the figure above, the network’s lower layers (i.e., text encoding layers) are shared across all tasks, while the top layers are task-specific, each handling a different type of NLU task. Like BERT, MT-DNN is trained in two phases: pre-training and fine-tuning. But unlike BERT, MT-DNN adds multi-task learning (MTL) in the fine-tuning phase, with multiple task-specific layers in its model architecture.
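To make that setup concrete, here is a minimal hypothetical PyTorch sketch of the idea, not Microsoft’s released code: a shared encoder feeds several task-specific heads, and each fine-tuning step samples a batch from one task, so gradients from every task update the shared layers. All module names, sizes, and the tiny training loop are illustrative assumptions; the actual model uses a pre-trained BERT encoder and the full GLUE task suite.

```python
# Illustrative sketch of the MT-DNN idea: shared lower layers, task-specific heads.
import random
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, tasks, vocab_size=30522, hidden=128):
        super().__init__()
        # Shared lower layers (a toy stand-in for the pre-trained BERT encoder).
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Task-specific top layers: one classification head per task.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, n_classes) for name, n_classes in tasks.items()}
        )

    def forward(self, token_ids, task):
        h = self.encoder(self.embed(token_ids))  # shared representation
        pooled = h[:, 0]                         # first token as sentence summary
        return self.heads[task](pooled)          # task-specific prediction

tasks = {"snli": 3, "cola": 2}                   # task name -> number of classes (assumed)
model = MultiTaskModel(tasks)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# MTL fine-tuning loop: each step draws a batch from one randomly chosen task.
for step in range(10):
    task = random.choice(list(tasks))
    token_ids = torch.randint(0, 30522, (8, 16))     # fake batch of 8 sequences
    labels = torch.randint(0, tasks[task], (8,))     # fake labels for that task
    loss = loss_fn(model(token_ids, task), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```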

MT-DNN achieved new state-of-the-art (SOTA) results on ten NLU tasks, including SNLI and SciTail, and on eight of nine GLUE tasks, pushing the GLUE benchmark score to 82.2% (a 1.8% absolute improvement). Using the SNLI and SciTail datasets, the researchers also demonstrate that the representations learned by MT-DNN allow domain adaptation with substantially fewer in-domain labels than pre-trained BERT representations.
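As a rough illustration of that few-label adaptation setting, one simplified approach is to reuse the already fine-tuned encoder and train only a fresh classification head on a small in-domain labeled set. This is a hypothetical sketch, not the paper’s exact protocol (the paper fine-tunes per task); freezing the encoder here just illustrates why few labels can suffice when the shared representations are already strong.

```python
# Hypothetical few-label adaptation sketch; all names and sizes are toy assumptions.
import torch
import torch.nn as nn

hidden, n_labels, n_examples = 128, 2, 32
encoder = nn.Embedding(30522, hidden)            # stand-in for the multi-task encoder
head = nn.Linear(hidden, n_labels)               # fresh head for the new domain

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
ids = torch.randint(0, 30522, (n_examples, 16))  # only 32 labeled in-domain examples
labels = torch.randint(0, n_labels, (n_examples,))

for epoch in range(20):
    with torch.no_grad():                        # keep learned representations fixed
        reps = encoder(ids).mean(dim=1)          # mean-pool token vectors
    loss = loss_fn(head(reps), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```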

For more details, see the paper on arXiv. Microsoft says it will release the code and pre-trained models.

The GLUE Benchmark leaderboard is here.


Author: Jessie Geng | Editor: Michael Sarazen


2018 Fortune Global 500 Public Company AI Adaptivity Report is out!
Purchase a Kindle-formatted report on Amazon.
Apply for Insight Partner Program to get a complimentary full PDF report.


Follow us on Twitter @Synced_Global for daily AI news!


We know you don’t want to miss any stories. Subscribe to our popular Synced Global AI Weekly to get weekly AI updates.
