🤖 AI Summary
To address the trade-off between retrieval depth (k) and re-ranking efficacy in retrieval-augmented generation (RAG), where a fixed k either omits key evidence or injects noise, this paper proposes DynamicRAG, a reinforcement learning-based dynamic re-ranking framework. The method treats LLM generation quality as a reward signal, training a re-ranker agent to determine an appropriate k per query and to perform query-aware document re-ranking. Integrated within a retrieval, re-ranking, and generation pipeline, it establishes an end-to-end quality feedback loop. Evaluated across seven knowledge-intensive benchmarks, the approach achieves state-of-the-art results, improving generation accuracy, faithfulness, and explainability. The code, models, and data are publicly released.
📝 Abstract
Retrieval-augmented generation (RAG) systems combine large language models (LLMs) with external knowledge retrieval, making them highly effective for knowledge-intensive tasks. A crucial but often under-explored component of these systems is the reranker, which refines retrieved documents to enhance generation quality and explainability. The challenge of selecting the optimal number of documents (k) remains unsolved: too few may omit critical information, while too many introduce noise and inefficiencies. Although recent studies have explored LLM-based rerankers, they primarily leverage internal model knowledge and overlook the rich supervisory signals that LLMs can provide, such as using response quality as feedback for optimizing reranking decisions. In this paper, we propose DynamicRAG, a novel RAG framework in which the reranker dynamically adjusts both the order and the number of retrieved documents based on the query. We model the reranker as an agent optimized through reinforcement learning (RL), using rewards derived from LLM output quality. Across seven knowledge-intensive datasets, DynamicRAG demonstrates superior performance, achieving state-of-the-art results. The model, data, and code are available at https://github.com/GasolSun36/DynamicRAG.
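The core idea, a reranker agent that reorders retrieved documents, picks a query-dependent cut-off k, and is updated from a generation-quality reward, can be illustrated with a toy sketch. This is not the authors' implementation: the linear scorer, the per-query mean-score cut-off, the stand-in reward, and the simplified score-ascent update (in the spirit of REINFORCE, without a sampled policy) are all illustrative assumptions.

```python
import random

class DynamicReranker:
    """Toy linear reranker agent: scores documents (feature vectors),
    keeps a query-dependent number k of them, and learns from a scalar
    reward. Purely illustrative; not DynamicRAG's actual model."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # one relevance weight per feature
        self.lr = lr

    def score(self, doc):
        return sum(w * f for w, f in zip(self.w, doc))

    def rerank(self, docs):
        """Sort by score and truncate at a dynamic, query-aware k:
        keep only documents scoring above this query's mean score."""
        ranked = sorted(docs, key=self.score, reverse=True)
        cut = sum(self.score(d) for d in ranked) / len(ranked)
        kept = [d for d in ranked if self.score(d) > cut]
        return kept or ranked[:1]  # always keep at least one document

    def update(self, docs, reward, baseline):
        """Score-ascent update: push the kept documents' scores up when
        the generation reward beats a running baseline, down otherwise."""
        advantage = reward - baseline
        for doc in self.rerank(docs):
            for i, f in enumerate(doc):
                self.w[i] += self.lr * advantage * f

# --- Toy training loop -------------------------------------------------
random.seed(0)

def make_query_docs():
    # Each doc is (true_relevance, noise); feature 0 drives the reward.
    return [(random.random(), random.random()) for _ in range(10)]

def generation_reward(kept):
    # Stand-in for LLM output quality: mean relevance of kept docs.
    return sum(d[0] for d in kept) / len(kept)

reranker = DynamicReranker(n_features=2)
baseline = 0.5
for _ in range(200):
    docs = make_query_docs()
    r = generation_reward(reranker.rerank(docs))
    reranker.update(docs, r, baseline)
    baseline = 0.9 * baseline + 0.1 * r  # exponential moving average
```

After training, the weight on the reward-correlated feature dominates the noise feature, and k varies from query to query because the cut-off depends on each query's score distribution. The real system replaces the linear scorer with an LLM-based reranker and the stand-in reward with measured generation quality.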