RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning

📅 2025-12-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing RAG systems for multi-hop question answering rely on fixed retrieval pipelines, struggle to dynamically integrate graph- and text-based evidence, and incur high computational overhead from graph retrieval. Method: We propose a multi-round adaptive graph-text hybrid RAG framework. (1) An end-to-end reinforcement learning policy jointly optimizes retrieval and generation, dynamically deciding the number of reasoning steps, source modality (graph vs. text), and termination timing. (2) A two-stage PPO training scheme balances answer accuracy and retrieval efficiency. (3) We incorporate graph-structure encoding, multi-step state modeling, and cost-aware reward design. Results: Our method achieves an average 11.2% accuracy gain across five benchmarks while reducing graph retrieval calls by 37%, significantly outperforming state-of-the-art approaches.

Technology Category

Application Category

📝 Abstract
Retrieval-Augmented Generation (RAG) integrates non-parametric knowledge into Large Language Models (LLMs), typically from unstructured texts and structured graphs. While recent progress has advanced text-based RAG to multi-turn reasoning through Reinforcement Learning (RL), extending these advances to hybrid retrieval introduces additional challenges. Existing graph-based or hybrid systems typically depend on fixed or handcrafted retrieval pipelines, lacking the ability to integrate supplementary evidence as reasoning unfolds. Besides, while graph evidence provides relational structures crucial for multi-hop reasoning, it is substantially more expensive to retrieve. To address these limitations, we introduce model{}, an RL-based framework that enables LLMs to perform multi-turn and adaptive graph-text hybrid RAG. model{} jointly optimizes the entire generation process via RL, allowing the model to learn when to reason, what to retrieve from either texts or graphs, and when to produce final answers, all within a unified generation policy. To guide this learning process, we design a two-stage training framework that accounts for both task outcome and retrieval efficiency, enabling the model to exploit hybrid evidence while avoiding unnecessary retrieval overhead. Experimental results across five question answering benchmarks demonstrate that model{} significantly outperforms existing RAG baselines, highlighting the benefits of end-to-end RL in supporting adaptive and efficient retrieval for complex reasoning.
Problem

Research questions and friction points this paper is trying to address.

Enables multi-turn adaptive retrieval from text and graph sources.
Optimizes retrieval decisions via reinforcement learning for efficiency.
Addresses high cost of graph retrieval in hybrid RAG systems.
Innovation

Methods, ideas, or system contributions that make the work stand out.

RL-based multi-turn adaptive hybrid RAG
Unified policy for reasoning, retrieval, and answering
Two-stage training for efficiency and task outcome
🔎 Similar Papers
No similar papers found.