R²R: A Route-to-Rerank Post-Training Framework for Multi-Domain Decoder-Only Rerankers

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address poor generalization and catastrophic forgetting of decoder-only rerankers in multi-domain RAG, this paper proposes a domain-aware dynamic expert reranking framework. Methodologically, it introduces: (1) an Entity Abstraction for Generalization (EAG) strategy that masks superficial cues via entity abstraction to mitigate overfitting; (2) a lightweight latent semantic router that dynamically activates LoRA-based expert modules, enabling model-agnostic, modular cross-domain adaptation; and (3) a two-stage training protocol that freezes the backbone decoder's representations while jointly optimizing the router and experts. Evaluated across legal, financial, and medical domains, the method significantly outperforms both generic and single-domain fine-tuned baselines. It demonstrates strong cross-domain robustness and plug-and-play compatibility, achieving effective zero-shot transfer without architectural modification or domain-specific retraining.
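The routing step in (2) can be sketched in miniature: a lightweight router scores a pooled query representation against per-domain vectors and activates the matching LoRA expert. This is a toy illustration, not the paper's implementation; the prototype vectors, pooling, and the `A`/`B` matrices below are hypothetical stand-ins (the actual router probes the frozen backbone's hidden states with learned parameters).

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical per-domain routing vectors (stand-ins for what a trained
# latent semantic router would learn from backbone hidden states).
PROTOTYPES = {
    "legal":   [0.9, 0.1, 0.0],
    "medical": [0.1, 0.9, 0.1],
    "finance": [0.0, 0.1, 0.9],
}

def route(pooled_hidden):
    """Pick the LoRA expert whose prototype best matches the query representation."""
    return max(PROTOTYPES, key=lambda d: cosine(pooled_hidden, PROTOTYPES[d]))

def lora_delta(x, A, B, scale=1.0):
    """Low-rank update delta = scale * B @ (A @ x), written with plain lists.
    Only the selected expert's A/B pair is applied; the backbone stays frozen."""
    h = [sum(a_i * x_i for a_i, x_i in zip(row, x)) for row in A]  # A @ x, rank-r
    return [scale * sum(b_i * h_i for b_i, h_i in zip(row, h)) for row in B]

# A query embedding close to the medical prototype routes to the medical expert.
expert = route([0.05, 0.88, 0.12])
```

Because only the router and one small LoRA pair run per query, expert selection adds a single cheap forward pass on top of the frozen backbone.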

📝 Abstract
Decoder-only rerankers are central to Retrieval-Augmented Generation (RAG). However, generalist models miss domain-specific nuances in high-stakes fields like finance and law, and naive fine-tuning causes surface-form overfitting and catastrophic forgetting. To address this challenge, we introduce R2R, a domain-aware framework that combines dynamic expert routing with a two-stage training strategy, Entity Abstraction for Generalization (EAG). EAG introduces a counter-shortcut mechanism by masking the most predictive surface cues, forcing the reranker to learn domain-invariant relevance patterns rather than memorizing dataset-specific entities. To efficiently activate domain experts, R2R employs a lightweight Latent Semantic Router that probes internal representations from the frozen backbone decoder to select the optimal LoRA expert per query. Extensive experiments across different reranker backbones and diverse domains (legal, medical, and financial) demonstrate that R2R consistently surpasses generalist and single-domain fine-tuned baselines. Our results confirm that R2R is a model-agnostic and modular approach to domain specialization with strong cross-domain robustness.
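The counter-shortcut idea behind EAG can be illustrated with a toy entity masker: replace entity-like surface spans with a generic placeholder so the reranker cannot memorize dataset-specific names as relevance cues. The regex heuristic and `[ENT]` tag below are assumptions for illustration only; the paper's abstraction procedure is not specified here, and a real pipeline would use a proper NER model.

```python
import re

def abstract_entities(text, tag="[ENT]"):
    """Mask entity-like spans so surface names cannot act as relevance shortcuts.
    Toy heuristic: runs of two or more capitalized words become one placeholder."""
    pattern = r"\b(?:[A-Z][a-zA-Z]+)(?:\s+[A-Z][a-zA-Z]+)+\b"
    return re.sub(pattern, tag, text)

masked = abstract_entities(
    "The court held that Goldman Sachs violated the Securities Exchange Act."
)
# Both multi-word proper names collapse to the same placeholder, forcing the
# model to judge relevance from structure and context rather than the names.
```

Training the reranker on such masked pairs is what the abstract calls masking "the most predictive surface cues" to encourage domain-invariant relevance patterns.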
Problem

Research questions and friction points this paper is trying to address.

Decoder-only rerankers miss domain-specific nuances in high-stakes fields
Naive fine-tuning causes surface-form overfitting and catastrophic forgetting
Generalist models lack cross-domain robustness for specialized domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic expert routing with two-stage training strategy
Counter-shortcut mechanism masking predictive surface cues
Latent Semantic Router selecting optimal LoRA expert
Xinyu Wang
SimpleWay.AI
Hanwei Wu
SimpleWay.AI
Qingchen Hu
SimpleWay.AI
Zhenghan Tai
University of Toronto
Information Retrieval · Large Language Model · Retrieval Augmented Generation
Jingrui Tian
SimpleWay.AI
Lei Ding
SimpleWay.AI
Jijun Chi
SimpleWay.AI
Hailin He
Unknown affiliation
Tung Sum Thomas Kwok
SimpleWay.AI
Yufei Cui
McGill University, MILA
Medical AI · RAG · LLM Agent · Predictive Uncertainty
Sicheng Lyu
SimpleWay.AI
Muzhi Li
The Chinese University of Hong Kong
Knowledge Graph · Natural Language Processing
Mingze Li
Université de Montréal
Xinyue Yu
SimpleWay.AI
Ling Zhou
CG Matrix
Peng Lu
Université de Montréal