🤖 AI Summary
This study identifies a systematic language preference bias in multilingual RAG: large language models (LLMs) disproportionately select documents in dominant languages (e.g., English) over semantically more relevant ones, degrading generation quality. To quantify this bias, we propose a controlled experimental framework leveraging ablation-based variable control and internal state analysis, evaluated across eight languages and six open-source LLMs. Results demonstrate that the bias is especially pronounced for low-resource languages and when relevant documents appear in mid-context positions; English queries exhibit significantly inflated English document citation rates, and citation decisions correlate only weakly with document informativeness. Crucially, this work provides the first mechanistic evidence, grounded in model internals, that language preference influences citation behavior independently of retrieval or generation confounders. Our findings yield actionable insights for developing fairer, more robust multilingual RAG systems and introduce a reproducible, language-agnostic evaluation framework.
📝 Abstract
Multilingual Retrieval-Augmented Generation (mRAG) systems enable language models to answer knowledge-intensive queries with citation-supported responses across languages. While such systems have been proposed, an open question is whether the mixture of different document languages affects generation and citation in unintended ways. To investigate, we introduce a controlled methodology that uses model internals to measure language preference while holding other factors, such as document relevance, constant. Across eight languages and six open-weight models, we find that models preferentially cite English sources when queries are in English, with this bias amplified for lower-resource languages and for documents positioned mid-context. Crucially, we find that models sometimes trade off document relevance for language preference, indicating that citation choices are not always driven by informativeness alone. Our findings shed light on how language models leverage multilingual context and how language preference shapes citation behavior.
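The core measurement idea above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual code: given a log of citation decisions made over document pairs that are equally relevant but differ in language (e.g., parallel translations), any skew in per-language citation rates reflects language preference rather than informativeness.

```python
from collections import Counter

def citation_rates(cited_langs):
    """Return each language's share of citation decisions."""
    counts = Counter(cited_langs)
    total = sum(counts.values())
    return {lang: counts[lang] / total for lang in counts}

# Hypothetical log: 100 citation decisions for English queries, where each
# context held an English document and an equally relevant German translation.
log = ["en"] * 72 + ["de"] * 28
print(citation_rates(log))  # {'en': 0.72, 'de': 0.28}
```

Because relevance is held constant by construction, a rate far above 0.5 for English, as in this toy log, is the kind of signal the controlled methodology is designed to isolate.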