Language-Coupled Reinforcement Learning for Multilingual Retrieval-Augmented Generation

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of knowledge bias and cross-lingual conflicts in multilingual retrieval-augmented generation (RAG), where uniform query processing across languages often leads to suboptimal knowledge integration. To mitigate these issues, the authors propose LcRL, a novel framework that incorporates a language-aware mechanism through language-coupled grouped relative policy optimization, enabling differentiated modeling of languages within both the policy and reward models. The approach employs grouped sampling to alleviate knowledge bias and introduces an anti-consistency penalty term to suppress cross-lingual knowledge conflicts. Experimental results demonstrate that LcRL significantly enhances knowledge acquisition and fusion under data-scarce training conditions and large-scale multilingual retrieval settings, consistently outperforming existing methods.
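The mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: it assumes GRPO-style group-relative advantages (reward normalized by the group mean and standard deviation), that "language-coupled group sampling" pools rollouts for semantically equivalent queries across languages into one group, and that the "anti-consistency penalty" subtracts a term proportional to cross-lingual disagreement among final answers. All function names and the `penalty_weight` parameter are invented for this sketch.

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each rollout's reward by the
    mean and standard deviation of its group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

def language_coupled_advantages(rewards, answers, penalty_weight=0.1):
    """Hypothetical sketch of language-coupled sampling with an
    anti-consistency penalty.

    `rewards[i]` and `answers[i]` come from the rollout for the i-th
    language variant of the same query; all variants form one GRPO group.
    Disagreement is measured as the fraction of answer pairs that differ,
    and is subtracted from every reward before normalization.
    """
    n = len(answers)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    disagreement = (
        sum(answers[i] != answers[j] for i, j in pairs) / len(pairs)
        if pairs else 0.0
    )
    penalized = [r - penalty_weight * disagreement for r in rewards]
    return group_relative_advantages(penalized)
```

Because the penalty is a uniform shift within the group, it leaves the relative ordering of advantages intact while lowering the absolute reward signal whenever the language variants contradict each other.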

📝 Abstract
Multilingual retrieval-augmented generation (MRAG) requires models to effectively acquire and integrate beneficial external knowledge from multilingual collections. However, most existing studies employ a unitive process where queries of equivalent semantics across different languages are processed through a single-turn retrieval and subsequent optimization. Such a ``one-size-fits-all'' strategy is often suboptimal in multilingual settings, as the models are prone to knowledge bias and conflict during interaction with the search engine. To alleviate these issues, we propose LcRL, a multilingual search-augmented reinforcement learning framework that integrates language-coupled Group Relative Policy Optimization into the policy and reward models. We adopt language-coupled group sampling in the rollout module to reduce knowledge bias, and regularize an auxiliary anti-consistency penalty in the reward models to mitigate knowledge conflict. Experimental results demonstrate that LcRL not only achieves competitive performance but is also well suited to practical scenarios such as constrained training data and retrieval over collections encompassing a large number of languages. Our code is available at https://github.com/Cherry-qwq/LcRL-Open.
Problem

Research questions and friction points this paper is trying to address.

Multilingual Retrieval-Augmented Generation
Knowledge Bias
Knowledge Conflict
Language-Coupled Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Language-Coupled Reinforcement Learning
Multilingual Retrieval-Augmented Generation
Group Relative Policy Optimization
Knowledge Bias Mitigation
Reward Regularization