🤖 AI Summary
This work addresses a critical limitation of existing retrieval-augmented generation (RAG) ranking models, which optimize solely for query-document relevance while ignoring each generator's distinct preferences for evidence, thereby constraining downstream response quality. To bridge this gap, we propose Rank4Gen, a novel framework that shifts the ranking objective from relevance alignment to response quality alignment. Rank4Gen explicitly models generator-conditioned preferences within a unified architecture, enabling adaptation to diverse generators. Leveraging PRISM, a newly constructed multi-source, multi-generator dataset, our approach employs a ranking optimization strategy driven by response-quality feedback. Extensive experiments across five mainstream RAG benchmarks demonstrate that Rank4Gen significantly enhances generation quality, particularly in complex evidence composition tasks, while exhibiting strong performance consistency and robustness.
📝 Abstract
In the RAG paradigm, the information retrieval module provides context for generators by retrieving and ranking multiple documents to support the aggregation of evidence. However, existing ranking models are primarily optimized for query--document relevance, which often misaligns with generators' preferences for evidence selection and citation, limiting their impact on response quality. Moreover, most approaches do not account for preference differences across generators, resulting in unstable cross-generator performance. We propose \textbf{Rank4Gen}, a generator-aware ranker for RAG that targets the goal of \emph{Ranking for Generators}. Rank4Gen introduces two key preference modeling strategies: (1) \textbf{From Ranking Relevance to Response Quality}, which optimizes ranking with respect to downstream response quality rather than query--document relevance; and (2) \textbf{Generator-Specific Preference Modeling}, which conditions a single ranker on different generators to capture their distinct ranking preferences. To enable such modeling, we construct \textbf{PRISM}, a dataset built from multiple open-source corpora and diverse downstream generators. Experiments on five challenging and recent RAG benchmarks demonstrate that Rank4Gen achieves strong and competitive performance for complex evidence composition in RAG.