Are Multilingual Language Models Fragile? IBM Adversarial Attack Strategies Cut MBERT QA Performance by 85%

Synced · Published in SyncedReview · Apr 22, 2021 · 4 min read

As large language models continue to achieve state-of-the-art (SOTA) results on question answering (QA) tasks, researchers are raising questions of their own about the robustness of these models. An IBM team recently conducted a comprehensive analysis of multilingual QA suggesting that SOTA models can be disappointingly fragile when presented with adversarially generated data.

Previous attack strategy studies have focused on monolingual QA performance, while attacks on multilingual QA have remained relatively unexplored. The IBM researchers take aim at the latter, applying four novel multilingual adversarial attack strategies against seven languages in a zero-shot setting. Faced with such attacks, the average performance of large multilingual pretrained language models such as MBERT tumbles by at least 20.3 percent and as much as 85.6 percent.
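The reported drops are relative to performance on clean data. A minimal sketch of that evaluation pattern is below; note that the toy character-swap perturbation and the F1 numbers are illustrative stand-ins chosen to reproduce the reported 85.6 percent figure, not the paper's actual attack strategies or results.

```python
import random

def perturb_question(question: str, swap_rate: float = 0.15, seed: int = 0) -> str:
    """Toy character-swap attack: swap two adjacent characters in random words.

    Illustrative only -- the IBM paper applies four multilingual attack
    strategies that are more sophisticated than this; the sketch just
    shows the clean-vs-attacked evaluation pattern.
    """
    rng = random.Random(seed)
    out = []
    for w in question.split():
        if len(w) > 3 and rng.random() < swap_rate:
            i = rng.randrange(1, len(w) - 2)
            w = w[:i] + w[i + 1] + w[i] + w[i + 2:]
        out.append(w)
    return " ".join(out)

def relative_drop(clean_f1: float, attacked_f1: float) -> float:
    """Percentage drop in QA score under attack, relative to clean performance."""
    return 100.0 * (clean_f1 - attacked_f1) / clean_f1

# Hypothetical scores: a clean F1 of 62.0 falling to 8.9 under attack
# corresponds to the worst-case drop reported in the article.
print(round(relative_drop(62.0, 8.9), 1))  # 85.6
```

The same helper reproduces the best-case figure: a drop from 100.0 to 79.7 gives 20.3 percent.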

The researchers summarize their main contributions as exposing flaws in multilingual QA systems and providing insights that are not evident in a single-language system, specifically:

  1. MBERT is more…



AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global