Beyond Weaponization: NLP Security for Medium and Lower-Resourced Languages in Their Own Right

📅 2025-07-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the longstanding English-centric bias in NLP security by systematically extending adversarial attacks to 70 low- and medium-resource languages, the first large-scale multilingual robustness evaluation of its kind. Methodologically, the authors adapt mainstream attack paradigms across languages and conduct parameter-scale analysis and architectural comparisons to identify security vulnerabilities arising from limited linguistic resources. Key contributions include: (1) demonstrating that multilingual capability does not inherently ensure robustness, as multilingual LMs exhibit unstable performance on low-resource languages; (2) revealing pervasive security weaknesses in small-scale monolingual models; and (3) advocating language-specific defense mechanisms tailored to low- and medium-resource settings. The findings provide critical empirical evidence and actionable deployment guidelines for advancing equitable, globally applicable AI security practices.

📝 Abstract
Despite mounting evidence that multilinguality can be easily weaponized against language models (LMs), works across NLP Security remain overwhelmingly English-centric. In terms of securing LMs, the NLP norm of "English first" collides with standard procedure in cybersecurity, whereby practitioners are expected to anticipate and prepare for worst-case outcomes. To mitigate worst-case outcomes in NLP Security, researchers must be willing to engage with the weakest links in LM security: lower-resourced languages. Accordingly, this work examines the security of LMs for lower- and medium-resourced languages. We extend existing adversarial attacks for up to 70 languages to evaluate the security of monolingual and multilingual LMs for these languages. Through our analysis, we find that monolingual models are often too small in total number of parameters to ensure sound security, and that while multilinguality is helpful, it does not always guarantee improved security either. Ultimately, these findings highlight important considerations for more secure deployment of LMs, for communities of lower-resourced languages.
Problem

Research questions and friction points this paper is trying to address.

Addressing English-centric bias in NLP security research
Evaluating LM security for lower-resourced languages
Assessing adversarial attack vulnerabilities in multilingual models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extending adversarial attacks to up to 70 languages
Evaluating the security of monolingual and multilingual LMs
Highlighting security considerations for lower-resourced languages
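To make the attack setting concrete, below is a minimal sketch of one common adversarial paradigm of the kind the paper evaluates: a query-based, character-level perturbation attack using Latin-to-Cyrillic homoglyph substitutions. This is an illustrative assumption, not the paper's actual attack suite; the `HOMOGLYPHS` map, `perturb`, and `attack` names are hypothetical, and the classifier passed to `attack` stands in for any target LM.

```python
import random

# Illustrative homoglyph map (an assumption for this sketch, not from the paper):
# Latin letters mapped to visually identical Cyrillic code points.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456"}

def perturb(text: str, rng: random.Random, rate: float = 0.2) -> str:
    """Swap a fraction of swappable characters for look-alike homoglyphs."""
    chars = list(text)
    candidates = [i for i, c in enumerate(chars) if c in HOMOGLYPHS]
    k = max(1, int(len(candidates) * rate)) if candidates else 0
    for i in rng.sample(candidates, k):
        chars[i] = HOMOGLYPHS[chars[i]]
    return "".join(chars)

def attack(classify, text: str, label, budget: int = 50, seed: int = 0):
    """Query the model up to `budget` times; return a perturbation that
    flips the predicted label, or None if the attack fails."""
    rng = random.Random(seed)
    for _ in range(budget):
        adv = perturb(text, rng)
        if classify(adv) != label:
            return adv
    return None
```

The attack leaves the text visually unchanged to a human reader while altering the token sequence the model sees, which is exactly the kind of vulnerability that tends to be more severe for models with limited training data in the target language.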