Rank4Gen: RAG-Preference-Aligned Document Set Selection and Ranking

📅 2026-01-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses a critical limitation in existing retrieval-augmented generation (RAG) ranking models, which optimize solely for query-document relevance while ignoring the generator’s personalized preferences for evidence, thereby constraining downstream response quality. To bridge this gap, we propose Rank4Gen—a novel framework that shifts the ranking objective from relevance alignment to response quality alignment. Rank4Gen explicitly models generator-conditioned preferences within a unified architecture, enabling adaptation to diverse generators. Leveraging PRISM, a newly constructed multi-source, multi-generator dataset, our approach employs a response-quality-feedback-driven ranking optimization strategy. Extensive experiments across five mainstream RAG benchmarks demonstrate that Rank4Gen significantly enhances generation quality—particularly in complex evidence composition tasks—while exhibiting strong performance consistency and robustness.

📝 Abstract
In the RAG paradigm, the information retrieval module provides context for generators by retrieving and ranking multiple documents to support the aggregation of evidence. However, existing ranking models are primarily optimized for query-document relevance, which often misaligns with generators' preferences for evidence selection and citation, limiting their impact on response quality. Moreover, most approaches do not account for preference differences across generators, resulting in unstable cross-generator performance. We propose Rank4Gen, a generator-aware ranker for RAG that targets the goal of Ranking for Generators. Rank4Gen introduces two key preference modeling strategies: (1) From Ranking Relevance to Response Quality, which optimizes ranking with respect to downstream response quality rather than query-document relevance; and (2) Generator-Specific Preference Modeling, which conditions a single ranker on different generators to capture their distinct ranking preferences. To enable such modeling, we construct PRISM, a dataset built from multiple open-source corpora and diverse downstream generators. Experiments on five challenging and recent RAG benchmarks demonstrate that Rank4Gen achieves strong and competitive performance for complex evidence composition in RAG.
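The core idea in the abstract — scoring candidate documents by predicted downstream response quality, conditioned on the target generator, rather than by query-document relevance alone — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `quality_model` callable and the toy scoring rule are assumptions standing in for a ranker trained on response-quality feedback (as with the PRISM data).

```python
# Hypothetical sketch of generator-conditioned reranking: each document is
# scored by an estimate of the response quality it yields for a specific
# generator, then sorted. All names and the toy model are illustrative.

from typing import Callable, List, Tuple


def rank_for_generator(
    query: str,
    documents: List[str],
    quality_model: Callable[[str, str, str], float],
    generator_id: str,
) -> List[Tuple[str, float]]:
    """Order documents by predicted response quality for one generator.

    quality_model(query, document, generator_id) -> scalar score; in the
    paper's framing this would be learned from response-quality feedback.
    """
    scored = [(doc, quality_model(query, doc, generator_id)) for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_quality(query: str, doc: str, generator_id: str) -> float:
    # Toy stand-in: token overlap with the query, plus a generator-specific
    # bonus to mimic one generator's preference for citation-style evidence.
    overlap = len(set(query.split()) & set(doc.split()))
    bonus = 1.0 if generator_id == "gen-A" and "citation" in doc else 0.0
    return overlap + bonus


docs = [
    "paris is the capital of france",
    "a citation about paris",
    "unrelated text",
]
ranking = rank_for_generator("capital of france", docs, toy_quality, "gen-A")
```

Because the score is conditioned on `generator_id`, the same candidate pool can be ordered differently for different generators, which is the cross-generator preference modeling the abstract describes.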
Problem

Research questions and friction points this paper is trying to address.

RAG
document ranking
generator preference
response quality
evidence selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

RAG
generator-aware ranking
response quality optimization
preference modeling
document ranking
Yongqi Fan
East China University of Science and Technology
LLM, AI Search, Medical NLP, IR, Agentic RL
Yuxiang Chu
East China University of Science and Technology, Shanghai, China
Zhentao Xia
East China University of Science and Technology, Shanghai, China
Xiaoyang Chen
Chinese Academy of Sciences
Information Retrieval, Large Language Models
Jie Liu
Tencent
Haijin Liang
Tencent
Jin Ma
Tencent
Ben He
Professor, University of Chinese Academy of Sciences
Natural Language Processing, Information Retrieval
Yingfei Sun
University of Chinese Academy of Sciences
Dezhi Ye
Tencent
Tong Ruan
East China University of Science and Technology
Clinical NLP, LLM, KG