BERTLang Helps Researchers Choose Between BERT Models

Synced · Published in SyncedReview
4 min read · Mar 13, 2020

Introduced by a Google team led by Jacob Devlin in 2018, the powerful Bidirectional Encoder Representations from Transformers (BERT) language model has enabled many breakthroughs in the field of natural language processing (NLP). Google, which built its brand on industry-leading Search performance, says BERT has even dramatically improved its understanding of search queries.

Google has also released a multilingual language model, mBERT, which is trained on a corpus of 104 languages and can be leveraged as a universal language model. While the NLP research community has seen impressive performance from BERT models trained on a particular language, there hasn’t been a clear comparison between mBERT and these language-specific BERT models. To evaluate the advantage of each model from the perspectives of architecture, tasks, and domain, a team of researchers from Bocconi University has prepared an online overview of the commonalities and differences between language-specific BERT models and mBERT.

Currently, approximately 5,000 GitHub repositories mention “BERT”. For researchers, deciding which language-specific model best suits their needs is a choice that can affect the entire research project. Models trained on a particular language and tested on specific data domains and tasks commonly draw their training data from sources such as Wikipedia, news, legislative and administrative texts, translated movie subtitles, etc. Common NLP tasks include Named Entity Recognition, Natural Language Inference, Paraphrase Identification, etc. To make sense of the different models and tasks and their interrelationships, the Bocconi University research team launched the “BERTLang” website.

The team says identifying which mBERT or language-specific models perform best at specific tasks is important for NLP progress and can also impact the use of computational resources. They tested 30 pretrained language-specific BERT models on 29 tasks in 18 languages with 177 different performance results.

Language-specific BERT models scored higher than mBERT on all 29 tasks. Cross-checking the average performance of different language-specific BERT models on various tasks provided additional insights. For example, the researchers observed that specialized models for low-resource languages such as Mongolian showed the largest improvement over mBERT. The paper suggests this is because the developers of language-specific BERT models are likely to be experts at sourcing and using appropriate training resources beyond Wikipedia, etc.
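The comparison described above boils down to aggregating per-task score gaps between each language-specific model and mBERT, then averaging by language. A minimal sketch of that kind of aggregation is below; the scores and the `average_improvement` helper are illustrative placeholders, not the paper's actual numbers or code.

```python
from collections import defaultdict

# Hypothetical results: (language, task, language-specific BERT score, mBERT score).
# These values are made up for illustration only.
results = [
    ("Mongolian", "NER", 0.87, 0.76),
    ("Mongolian", "Sentiment", 0.90, 0.81),
    ("Italian", "NER", 0.92, 0.89),
    ("Italian", "NLI", 0.85, 0.83),
]

def average_improvement(rows):
    """Mean score gap (language-specific minus mBERT) per language."""
    gaps = defaultdict(list)
    for lang, _task, specific, mbert in rows:
        gaps[lang].append(specific - mbert)
    return {lang: sum(v) / len(v) for lang, v in gaps.items()}

improvements = average_improvement(results)
# Languages with the largest average gap benefit most from a dedicated model.
best = max(improvements, key=improvements.get)
```

On this toy data the largest average gap belongs to Mongolian, mirroring the paper's observation that low-resource languages gain the most from dedicated models.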

In the future, the team is planning to add independent verification of reported results and direct comparisons of language-specific BERT models on domains and tasks.

The paper What the [MASK]? Making Sense of Language-Specific BERT Models is on arXiv, and the BERTLang website is here.

Journalist: Fangyu Cai | Editor: Michael Sarazen

To highlight the contributions of women in the AI industry, Synced introduces the Women in AI special project this month and invites female researchers from the field to share their recent research and the stories behind it. Join our conversation by clicking here.

Thinking of contributing to Synced Review? Synced’s new column Share My Research welcomes scholars to share their own research breakthroughs with global AI enthusiasts.

We know you don’t want to miss any story. Subscribe to our popular Synced Global AI Weekly to get weekly AI updates.

Need a comprehensive review of the past, present and future of modern AI research development? Trends of AI Technology Development Report is out!

2018 Fortune Global 500 Public Company AI Adaptivity Report is out!
Purchase a Kindle-formatted report on Amazon.
Apply for Insight Partner Program to get a complimentary full PDF report.


AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global