SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback

📅 2024-10-22
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Existing RAG systems train retrieval, query rewriting, and generation modules separately, leading to suboptimal end-to-end coordination. Method: We propose an end-to-end trainable RAG framework that unifies all three components as a single policy network agent; the retriever and policy are jointly optimized via differentiable retrieval interfaces and reinforcement learning. A reward-shaping scheme explicitly balances answer accuracy against retrieval cost (e.g., the number of retrieval calls), encouraging cost-efficient use of the environment. Contribution/Results: Evaluated on multiple open-domain question answering benchmarks, our method significantly outperforms separately trained baselines, achieving higher answer accuracy while reducing the average number of retrievals. This demonstrates that joint optimization delivers substantial, holistic performance gains for RAG systems.
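The reward shaping described above can be sketched in a few lines. This is a minimal illustration of the accuracy-versus-retrieval-cost trade-off, not the paper's actual reward: the `cost_penalty` weight and exact-match scoring are assumptions for the sake of the example.

```python
def shaped_reward(prediction: str, gold: str, num_retrievals: int,
                  cost_penalty: float = 0.1) -> float:
    """Accuracy reward minus a penalty per retrieval call.

    Hypothetical sketch: SmartRAG's real reward design may differ.
    """
    # Exact-match accuracy, case- and whitespace-insensitive.
    accuracy = 1.0 if prediction.strip().lower() == gold.strip().lower() else 0.0
    # Each retrieval call costs a fixed penalty, so a correct answer
    # reached with fewer retrievals earns a higher reward.
    return accuracy - cost_penalty * num_retrievals

# A correct answer found without retrieving scores highest.
print(shaped_reward("Paris", "paris", 0))  # 1.0
print(shaped_reward("London", "paris", 1))  # -0.1
```

Under such a reward, the policy is pushed to skip retrieval when it can already answer, which matches the paper's goal of best performance at minimal retrieval cost.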

📝 Abstract
RAG systems consist of multiple modules that work together. However, these modules are usually trained separately. We argue that a system like RAG that incorporates multiple modules should be jointly optimized to achieve optimal performance. To demonstrate this, we design a specific pipeline called SmartRAG that includes a policy network and a retriever. The policy network can serve as 1) a decision maker that decides when to retrieve, 2) a query rewriter to generate a query most suited to the retriever, and 3) an answer generator that produces the final response with/without the observations. We then propose to jointly optimize the whole system using a reinforcement learning algorithm, with the reward designed to encourage the system to achieve the best performance with minimal retrieval cost. When jointly optimized, all the modules can be aware of how other modules are working and thus find the best way to work together as a complete system. Empirical results demonstrate that the jointly optimized SmartRAG can achieve better performance than separately optimized counterparts.
Problem

Research questions and friction points this paper is trying to address.

Joint optimization of RAG system modules for better performance.
Integration of policy network for retrieval, query rewriting, and answer generation.
Reinforcement learning to maximize answer accuracy while minimizing retrieval cost.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Jointly optimizes RAG modules using reinforcement learning
Policy network integrates retrieval, rewriting, and response generation
Minimizes retrieval cost while maximizing system performance
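The three roles the policy network plays can be sketched as a single inference step. The function names and the toy stand-ins below are illustrative assumptions; in SmartRAG these roles are filled by one learned policy, not separate hand-written functions.

```python
from typing import Callable, Optional

def smartrag_step(question: str,
                  decide_retrieve: Callable[[str], bool],
                  rewrite_query: Callable[[str], str],
                  retrieve: Callable[[str], str],
                  generate: Callable[[str, Optional[str]], str]) -> str:
    """One inference step: optionally retrieve, then answer.

    Hypothetical control-flow sketch of the three policy roles.
    """
    observation = None
    if decide_retrieve(question):           # role 1: decide when to retrieve
        query = rewrite_query(question)     # role 2: rewrite the query
        observation = retrieve(query)       # call the external retriever
    return generate(question, observation)  # role 3: generate the answer

# Toy stand-ins to show the control flow.
answer = smartrag_step(
    "Who wrote Hamlet?",
    decide_retrieve=lambda q: "wrote" in q,
    rewrite_query=lambda q: q.rstrip("?"),
    retrieve=lambda q: "Hamlet is a play by William Shakespeare.",
    generate=lambda q, obs: obs or "I don't know.",
)
print(answer)
```

Joint optimization means all three roles are trained together against the same reward, so the decision maker learns when retrieval is worth its cost and the rewriter learns queries the retriever actually handles well.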
Jingsheng Gao
Shanghai Jiao Tong University
Linxu Li
Xiaobing.AI
Weiyuan Li
Alibaba Group
Yuzhuo Fu
Shanghai Jiao Tong University
Bin Dai
Xiaobing.AI