🤖 AI Summary
This study investigates cross-lingual consistency of factual knowledge and coreference resolution in multilingual large language models (LLMs). To this end, we construct a semantically aligned cross-lingual coreference evaluation dataset and propose a comprehensive assessment framework integrating mixed-language inputs, layer-wise interpretability analysis, cross-lingual word alignment supervision, and code-switching training. Our experiments reveal, for the first time, a language-invariant consistency bottleneck at specific hidden layers, with linguistic distance and morphological typology significantly modulating consistency levels across languages. Crucially, we demonstrate that incorporating code-switching training and explicit cross-lingual alignment objectives substantially improves consistency. These findings provide both theoretical grounding and a reproducible technical pathway for modeling and optimizing knowledge alignment mechanisms in multilingual LLMs.
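To make the layer-wise interpretability analysis mentioned above more concrete, the sketch below compares hidden states of two semantically parallel factual statements at every layer of a multilingual model. The choice of model (`xlm-roberta-base`), mean pooling, and cosine similarity are illustrative assumptions, not the paper's exact protocol; a pronounced dip at some layer would be the kind of consistency bottleneck the summary describes.

```python
# Minimal sketch: per-layer similarity of hidden states for parallel prompts.
# Assumes a HuggingFace multilingual model; pooling and similarity choices are
# illustrative, not the paper's exact measurement protocol.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "xlm-roberta-base"  # assumption: any multilingual encoder serves for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def layer_representations(text: str) -> list[torch.Tensor]:
    """Return one mean-pooled sentence vector per layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states: tuple of (1, seq_len, hidden) tensors, one per layer (plus embeddings)
    return [h.mean(dim=1).squeeze(0) for h in outputs.hidden_states]

def per_layer_consistency(text_a: str, text_b: str) -> list[float]:
    """Cosine similarity between the two sentences at every layer."""
    reps_a, reps_b = layer_representations(text_a), layer_representations(text_b)
    return [
        torch.nn.functional.cosine_similarity(a, b, dim=0).item()
        for a, b in zip(reps_a, reps_b)
    ]

# Parallel factual statements (English / German).
scores = per_layer_consistency(
    "Marie Curie was born in Warsaw.",
    "Marie Curie wurde in Warschau geboren.",
)
for layer, score in enumerate(scores):
    print(f"layer {layer:2d}: cosine similarity = {score:.3f}")
```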
📝 Abstract
Cross-lingual consistency should be considered to assess cross-lingual transferability, maintain the factuality of model knowledge across languages, and preserve the parity of language model performance. We are thus interested in analyzing, evaluating, and interpreting cross-lingual consistency for factual knowledge. To study cross-lingual knowledge consistency, we examine code-mixed coreferential statements that convey identical knowledge across languages. We use interpretability approaches to analyze the behavior of a model in cross-lingual contexts, discovering that multilingual models show different levels of consistency depending on language families and linguistic factors, and that cross-lingual consistency is bottlenecked at a particular layer. In addition, we evaluate common strategies aimed at improving multilingual performance to observe whether these strategies also improve knowledge consistency. While knowledge is not cross-lingually consistent in many cases, code-switching training and cross-lingual word alignment objectives show the most promising results, underscoring the importance of cross-lingual alignment supervision and code-switching training for enhancing both multilingual performance and cross-lingual consistency.
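The two most promising strategies named in the abstract can be illustrated with short sketches. Below, (1) builds code-switched training text by substituting dictionary translations into a sentence, and (2) defines an auxiliary word alignment loss that pulls together hidden states of aligned token pairs. The toy lexicon, switch rate, hypothetical `code_switch` and `alignment_loss` helpers, and the cosine-distance loss form are assumptions for illustration, not the paper's exact recipe.

```python
# Hypothetical sketches of the two strategies highlighted in the abstract:
# (1) code-switching augmentation from a bilingual lexicon, and
# (2) an auxiliary cross-lingual word alignment objective.
import random
import torch
import torch.nn.functional as F

# --- (1) code-switching augmentation ---------------------------------------
LEXICON = {  # toy English -> German dictionary; real pipelines derive this from word alignments
    "capital": "Hauptstadt",
    "France": "Frankreich",
    "city": "Stadt",
}

def code_switch(sentence: str, switch_rate: float = 0.5, seed: int = 0) -> str:
    """Randomly replace in-lexicon words with their translations."""
    rng = random.Random(seed)
    return " ".join(
        LEXICON[tok] if tok in LEXICON and rng.random() < switch_rate else tok
        for tok in sentence.split()
    )

# --- (2) cross-lingual word alignment objective -----------------------------
def alignment_loss(src_hidden: torch.Tensor,
                   tgt_hidden: torch.Tensor,
                   alignments: list[tuple[int, int]]) -> torch.Tensor:
    """Average cosine distance between aligned token representations."""
    src_idx = torch.tensor([i for i, _ in alignments])
    tgt_idx = torch.tensor([j for _, j in alignments])
    sims = F.cosine_similarity(src_hidden[src_idx], tgt_hidden[tgt_idx], dim=-1)
    return (1.0 - sims).mean()

print(code_switch("The capital of France is Paris"))
# Stand-in hidden states; during training these come from the model and the
# alignment term is added to the usual language-modeling loss.
src, tgt = torch.randn(5, 768, requires_grad=True), torch.randn(6, 768)
print(alignment_loss(src, tgt, alignments=[(0, 0), (2, 3), (4, 5)]).item())
```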