All Languages Matter: Understanding and Mitigating Language Bias in Multilingual RAG

📅 2026-04-22
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a significant language bias in multilingual retrieval-augmented generation (mRAG) systems during the reranking stage, where existing rerankers disproportionately favor evidence in English or the query language, thereby suppressing critical cross-lingual information. The study presents the first quantitative analysis of this bias and reveals a substantial performance gap between current rerankers and the theoretical upper bound through oracle evidence estimation. To mitigate this issue, the authors propose LAURA, a language-agnostic, utility-driven reranking alignment method that explicitly aligns multilingual evidence ranking with downstream generation objectives, eliminating reliance on monolingual or query-language cues. Experimental results demonstrate that LAURA consistently improves question-answering accuracy and generation quality across diverse languages and generative models, effectively alleviating language bias.
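One simple way to see the kind of bias described above is to compare each language's share of the reranker's top-k results with its share of the full candidate pool. The sketch below is illustrative only: the function name, the toy language tags, and the "share difference" measure are assumptions made for this example, not the metric used in the paper.

```python
from collections import Counter


def language_share_at_k(ranked_langs, k):
    """Return the fraction of the top-k documents written in each language."""
    top = ranked_langs[:k]
    if not top:
        return {}
    counts = Counter(top)
    return {lang: n / len(top) for lang, n in counts.items()}


# Hypothetical toy data: language tags of candidate documents, in reranked order.
ranked = ["en", "en", "zh", "en", "de", "zh", "ar", "de"]

pool_share = language_share_at_k(ranked, len(ranked))  # share in the whole pool
top_share = language_share_at_k(ranked, 4)             # share in the top 4

# Positive values: the language is over-represented in the top-k
# relative to the candidate pool; negative values: under-represented.
skew = {lang: top_share.get(lang, 0.0) - share for lang, share in pool_share.items()}

for lang, value in sorted(skew.items()):
    print(f"{lang}: {value:+.3f}")
```

In this toy ranking English fills three of the four top slots although it makes up only three of eight candidates, so its skew is positive while German and Arabic are pushed out of the top-k entirely.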


📝 Abstract
Multilingual Retrieval-Augmented Generation (mRAG) leverages cross-lingual evidence to ground Large Language Models (LLMs) in global knowledge. However, we show that current mRAG systems suffer from a language bias during reranking, systematically favoring English and the query's native language. By introducing an estimated oracle evidence analysis, we quantify a substantial performance gap between existing rerankers and the achievable upper bound. Further analysis reveals a critical distributional mismatch: while optimal predictions require evidence scattered across multiple languages, current systems systematically suppress such "answer-critical" documents, thereby limiting downstream generation performance. To bridge this gap, we propose Language-Agnostic Utility-driven Reranker Alignment (LAURA), which aligns multilingual evidence ranking with downstream generative utility. Experiments across diverse languages and generation models show that LAURA effectively mitigates language bias and consistently improves mRAG performance.
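The idea of ranking evidence by downstream utility rather than by surface cues such as language can be illustrated with a small sketch. Everything here is an assumption for illustration: `toy_utility` is a stand-in for "does the generator answer correctly when given this document", the documents are made up, and this is not the paper's oracle estimation or the LAURA training procedure, neither of which is specified on this page.

```python
def utility_ranking(docs, utility):
    """Order documents by a downstream utility score, ignoring their language."""
    # sorted() is stable, so documents with equal utility keep their input order.
    return sorted(docs, key=utility, reverse=True)


def toy_utility(doc):
    # Hypothetical stand-in for a generation-based utility signal.
    return 1.0 if doc["has_answer"] else 0.0


def hits_at_k(order, k):
    """Count answer-bearing documents among the first k of an ordering."""
    return sum(1 for doc in order[:k] if doc["has_answer"])


# Hypothetical candidates, listed in the order a language-biased reranker might return.
docs = [
    {"id": "d1", "lang": "en", "has_answer": False},
    {"id": "d2", "lang": "sw", "has_answer": True},
    {"id": "d3", "lang": "en", "has_answer": False},
    {"id": "d4", "lang": "ja", "has_answer": True},
]

oracle = utility_ranking(docs, toy_utility)
oracle_order = [d["id"] for d in oracle]

print("biased top-2 hits:", hits_at_k(docs, 2))
print("utility top-2 hits:", hits_at_k(oracle, 2))
print("utility order:", oracle_order)
```

The gap between the two top-2 hit counts is a toy analogue of the gap between an existing reranker and an estimated upper bound: the answer-bearing documents exist in the pool, but a language-biased ordering keeps them out of the generator's context.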
Problem

Research questions and friction points this paper is trying to address.

language bias
multilingual RAG
reranking
cross-lingual evidence
distributional mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

language bias
multilingual RAG
reranking
utility-driven alignment
cross-lingual evidence

Dan Wang
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences

Guozhao Mo
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences

Yafei Shi
MYbank, AntGroup

Cheng Zhang
MYbank, AntGroup

Bo Zheng
Researcher, Alibaba Group
AI, Network, E-Commerce

Boxi Cao
Institute of Software, Chinese Academy of Sciences
Natural Language Processing

Xuanang Chen
Institute of Software, Chinese Academy of Sciences
Information Retrieval, Natural Language Processing

Yaojie Lu
Institute of Software, Chinese Academy of Sciences
Information Extraction, Large Language Models

Hongyu Lin
Institute of Software, Chinese Academy of Sciences
Natural Language Processing, Information Extraction and Machine Learning

Ben He
Professor, University of Chinese Academy of Sciences
Natural Language Processing, Information Retrieval

Xianpei Han
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences

Le Sun
Institute of Software, CAS
Information Retrieval, Natural Language Processing