The Multilingual Divide and Its Impact on Global AI Safety

📅 2025-05-27
🤖 AI Summary
This paper identifies a critical multilingual capability gap in global AI safety: mainstream large language models exhibit systematic performance degradation and heightened security vulnerabilities in non-dominant languages—particularly African, Southeast Asian, and Indigenous languages. Methodologically, the study establishes the first causal framework linking “linguistic disparity” to “security inequity,” integrating cross-lingual safety evaluation, multi-source modeling of linguistic capability gaps, and policy pathway simulation. It identifies three structural barriers: data scarcity, absence of standardized multilingual safety benchmarks, and insufficient model transparency. The contribution is a pragmatic, governance-oriented framework for multilingual AI safety, comprising (1) development of high-quality, diverse multilingual datasets; (2) open-sourced, cross-lingual safety evaluation benchmarks; and (3) a multilateral collaborative governance mechanism. The findings provide both theoretical grounding and actionable policy levers to address global disparities in AI safety capacity.

📝 Abstract
Despite advances in large language model capabilities in recent years, a large gap remains in their capabilities and safety performance for many languages beyond a relatively small handful of globally dominant languages. This paper provides researchers, policymakers and governance experts with an overview of key challenges to bridging the "language gap" in AI and minimizing safety risks across languages. We provide an analysis of why the language gap in AI exists and grows, and how it creates disparities in global AI safety. We identify barriers to address these challenges, and recommend how those working in policy and governance can help address safety concerns associated with the language gap by supporting multilingual dataset creation, transparency, and research.

Problem

Research questions and friction points this paper is trying to address.

Addressing the AI capability gap for non-dominant languages
Reducing disparities in safety risk across languages
Promoting multilingual datasets and research transparency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Support multilingual dataset creation
Enhance transparency in AI models
Promote research on language disparities