🤖 AI Summary
This work identifies systematic inequities in how large language models (LLMs) continually learn new knowledge across languages. To study this dynamic process, the authors propose a fairness analysis framework covering four dimensions (effectiveness, transferability, prioritization, and robustness), moving beyond conventional static capability assessments. The evaluation combines in-context learning and fine-tuning, and spans multiple scenarios using both proprietary and open-source LLMs. Results show that low-resource languages consistently underperform on all four dimensions, and that these inequities intensify as continual learning progresses. The study provides systematic empirical evidence of multilingual learning biases in LLMs and offers concrete recommendations, spanning data curation, training scheduling, and architectural design, to foster more equitable and inclusive next-generation language models.
📝 Abstract
As large language models (LLMs) gradually become integral tools for problem solving in daily life worldwide, understanding linguistic inequality is becoming increasingly important. Existing research has primarily focused on static analyses that assess disparities in the existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to generate up-to-date, domain-specific responses. Investigating linguistic inequalities within this dynamic process is therefore also essential. In this paper, we explore inequalities in how LLMs learn new knowledge across different languages, along four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments under two settings (in-context learning and fine-tuning) using both proprietary and open-source models, we demonstrate that low-resource languages consistently face disadvantages across all four dimensions. By shedding light on these disparities, we aim to raise awareness of linguistic inequalities in LLMs' new knowledge learning, fostering the development of more inclusive and equitable future LLMs.
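The in-context learning setting described above can be illustrated with a minimal sketch: a new fact is injected into the prompt in each language, and the "effectiveness" dimension is measured as whether the model's answer recalls it. This is not the paper's actual harness; `query_model` is a hypothetical stand-in for a real LLM API call, stubbed here so that it only succeeds for the high-resource language, simulating the disparity the paper reports.

```python
def query_model(prompt: str) -> str:
    """Hypothetical LLM call. This toy stub 'recalls' the injected fact
    only for the English prompt, simulating a model that handles the
    high-resource language better than the low-resource one."""
    if "Zorvia" in prompt and "language=en" in prompt:
        return "The capital of Zorvia is Qel."
    return "I do not know."

def effectiveness(facts: dict[str, str], question: str) -> dict[str, bool]:
    """Per-language success rate: did the model use the injected fact?"""
    results = {}
    for lang, fact in facts.items():
        # Inject the new fact in-context, then ask the question.
        prompt = f"[language={lang}] New fact: {fact}\nQ: {question}\nA:"
        answer = query_model(prompt)
        results[lang] = "Qel" in answer  # expected answer token
    return results

# Illustrative fact pair: same new knowledge, two languages.
facts = {
    "en": "The capital of Zorvia is Qel.",       # high-resource
    "sw": "Mji mkuu wa Zorvia ni Qel.",          # low-resource (Swahili)
}
print(effectiveness(facts, "What is the capital of Zorvia?"))
# → {'en': True, 'sw': False}
```

The other dimensions follow the same pattern: transferability would inject the fact in one language and query in another, while robustness would paraphrase the question.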