🤖 AI Summary
This work addresses the poor cross-lingual consistency and sharply degraded performance of pretrained language models (e.g., BERT, GPT, Mistral, TowerInstruct, OpenHathi) on low-resource languages in multilingual settings. We investigate how model editing affects linguistic fairness, specifically whether and how knowledge editing shifts performance across languages. To this end, we propose two stress-testing paradigms, ELFI ('each language for itself') and ELFO ('each language for others'), and use them to systematically evaluate the cross-lingual performance shifts that editing induces across eight diverse languages. Experimental results show that targeted editing can improve accuracy for low-resource languages (e.g., Tamil, Kannada), yet may exacerbate inconsistencies between certain language pairs. Based on these findings, we introduce the first evaluation framework explicitly designed to assess language fairness in model editing. This study provides empirical evidence and a methodological foundation for improving inclusivity and equitable performance in multilingual large language models.
📝 Abstract
The integration of pretrained language models (PLMs) such as BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper makes the case for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate models including Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our experiments reveal significant discrepancies in cross-lingual consistency for both standard and merged models. We stress-test these models with two strategies: 'each language for itself' (ELFI) and 'each language for others' (ELFO). Our findings demonstrate the potential of LLMs to overcome linguistic barriers, laying the groundwork for future research on achieving linguistic inclusivity in AI technologies.