Uncovering inequalities in new knowledge learning by large language models across different languages

📅 2025-03-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies systematic inequities in large language models' (LLMs) multilingual continual learning of new knowledge. To address this, we propose the first dynamic fairness analytical framework, evaluating four critical dimensions—effectiveness, transferability, prioritization, and robustness—thereby moving beyond conventional static capability assessments. Our methodology integrates in-context learning with parameter-efficient fine-tuning, enabling cross-lingual, multi-scenario empirical evaluation across both closed- and open-source LLMs. Results demonstrate that low-resource languages consistently underperform across all four dimensions, with inequities intensifying progressively throughout the continual learning process. This study provides the first systematic empirical evidence characterizing multilingual learning biases in LLMs. Furthermore, it offers actionable, technically grounded recommendations—spanning data curation, training scheduling, and architectural design—to foster more equitable and inclusive next-generation language models.

📝 Abstract
As large language models (LLMs) gradually become integral tools for problem solving in daily life worldwide, understanding linguistic inequality is becoming increasingly important. Existing research has primarily focused on static analyses that assess the disparities in the existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to generate up-to-date, domain-specific responses. Investigating linguistic inequalities within this dynamic process is, therefore, also essential. In this paper, we explore inequalities in new knowledge learning by LLMs across different languages and four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments under two settings (in-context learning and fine-tuning) using both proprietary and open-source models, we demonstrate that low-resource languages consistently face disadvantages across all four dimensions. By shedding light on these disparities, we aim to raise awareness of linguistic inequalities in LLMs' new knowledge learning, fostering the development of more inclusive and equitable future LLMs.
Problem

Research questions and friction points this paper is trying to address.

Inequalities in new knowledge learning by LLMs across languages.
Disparities in effectiveness, transferability, prioritization, and robustness.
Low-resource languages face disadvantages in LLM knowledge acquisition.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Explores LLM knowledge learning inequalities across languages.
Uses in-context learning and fine-tuning experiments.
Focuses on effectiveness, transferability, prioritization, and robustness.
Chenglong Wang
School of Urban Planning and Design, Peking University Shenzhen Graduate School, Shenzhen, China; Key Laboratory of Earth Surface System and Human-Earth Relations of Ministry of Natural Resources of China, Peking University Shenzhen Graduate School, Shenzhen, China
Haoyu Tang
School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Xiyuan Yang
UIUC
Trustworthy Machine Learning
Yueqi Xie
Princeton University
AI and Society, Responsible AI, Social Computing, Computational Social Science
Jina Suh
Microsoft Research, University of Washington
machine learning, human-computer interaction, mental health
Sunayana Sitaram
Microsoft Research India
Multilingual NLP, evaluation, LLMs and culture, multilingualism, LLMs
Junming Huang
Paul and Marcia Center on Contemporary China, Princeton University, Princeton, USA
Yu Xie
Paul and Marcia Center on Contemporary China, Princeton University, Princeton, USA; Center for Social Research, Guanghua School of Management, Peking University, Beijing, China
Zhaoya Gong
Peking University Shenzhen Graduate School
GIScience, GeoAI, Geocomputation, Urban and Regional Science, Cyberinfrastructure
Xing Xie
Microsoft Research Asia, Beijing, China
Fangzhao Wu
Microsoft
Responsible AI