🤖 AI Summary
Traditional RAG systems rely on static retrieval and struggle with complex questions that require multi-step, adaptive information-seeking. Method: The paper introduces RAG-Gym, a unified optimization framework that trains information-seeking agents with fine-grained process supervision at each search step, together with ReSearch, a novel agent architecture that synergizes answer reasoning and search query generation within RAG-Gym. Contribution/Results: On four challenging benchmarks, RAG-Gym improves performance by up to 25.6% across various agent architectures, with ReSearch consistently outperforming existing baselines. Further analyses show that advanced LLMs are effective process reward judges and that trained reward models transfer as verifiers across different LLMs.
📝 Abstract
Retrieval-augmented generation (RAG) has shown great potential for knowledge-intensive tasks, but its traditional architectures rely on static retrieval, limiting their effectiveness for complex questions that require sequential information-seeking. While agentic reasoning and search offer a more adaptive approach, most existing methods depend heavily on prompt engineering. In this work, we introduce RAG-Gym, a unified optimization framework that enhances information-seeking agents through fine-grained process supervision at each search step. We also propose ReSearch, a novel agent architecture that synergizes answer reasoning and search query generation within the RAG-Gym framework. Experiments on four challenging datasets show that RAG-Gym improves performance by up to 25.6% across various agent architectures, with ReSearch consistently outperforming existing baselines. Further analysis highlights the effectiveness of advanced LLMs as process reward judges and the transferability of trained reward models as verifiers for different LLMs. Additionally, we examine the scaling properties of training and inference in agentic RAG. The project homepage is available at https://rag-gym.github.io/.