The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It

📅 2025-05-30
🤖 AI Summary
This study exposes a severe English-centric bias in LLM safety research: among nearly 300 top-tier conference papers (2020–2024), over 96% are in English. High-resource non-English languages (e.g., Chinese, Spanish, Arabic) receive virtually no dedicated safety studies, cross-lingual safety generalization is poor, and evaluation documentation is inconsistent. Through a systematic literature review, multi-source metadata mining, and language-topic mapping analysis, we conduct the first comprehensive mapping of the coverage gap in multilingual safety research. We propose three research directions: (1) a multilingual safety evaluation framework, (2) methods for constructing safety-aware multilingual training data, and (3) mechanisms for cross-lingual safety knowledge transfer. Our contributions include an open-source multilingual safety survey dataset, a reusable evaluation toolkit, and community guidelines adopted by ACL 2025 to advance inclusive, globally representative AI safety paradigms.

📝 Abstract
This paper presents a comprehensive analysis of the linguistic diversity of LLM safety research, highlighting the English-centric nature of the field. Through a systematic review of nearly 300 publications from 2020–2024 across major NLP conferences and workshops at *ACL, we identify a significant and growing language gap in LLM safety research, with even high-resource non-English languages receiving minimal attention. We further observe that non-English languages are rarely studied as standalone languages and that English safety research exhibits poor language documentation practices. To motivate future research into multilingual safety, we make several recommendations based on our survey, and we then pose three concrete future directions on safety evaluation, training data generation, and crosslingual safety generalization. Based on our survey and proposed directions, the field can develop more robust, inclusive AI safety practices for diverse global populations.
Problem

Research questions and friction points this paper is trying to address.

Analyzes linguistic diversity gaps in LLM safety research
Highlights neglect of non-English languages in safety studies
Proposes multilingual safety evaluation and training solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

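Systematically reviews nearly 300 LLM safety publications (2020–2024) for linguistic diversity
Identifies a significant and growing language gap, with even high-resource non-English languages receiving minimal attention
Poses three future directions: safety evaluation, training data generation, and crosslingual safety generalization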