DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision

📅 2025-10-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low exploration efficiency, sparse rewards, and ambiguous global feedback in Agentic Retrieval-Augmented Generation (RAG) for complex tasks, this paper formalizes RAG as a two-stage decision-execution Markov Decision Process and proposes the DecEx-RAG framework. Its key contributions are: (1) process-level policy optimization with fine-grained procedural supervision, enabling autonomous task decomposition and dynamic retrieval; (2) an efficient pruning strategy to enhance both data expansion quality and construction speed; and (3) integration of reinforcement learning with a dynamic workflow mechanism. Evaluated on six benchmark datasets, DecEx-RAG achieves an average absolute performance gain of 6.2% and improves data construction efficiency by 5.8× over state-of-the-art baselines.
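The two-stage decision-execution loop described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the `policy` and `retriever` interfaces, method names, and the episode structure are all assumptions made for clarity. In DecEx-RAG, process supervision would score each intermediate (state, decision, execution) step rather than only the final answer.

```python
def agentic_rag_episode(question, policy, retriever, max_steps=8):
    """Run one episode of a hypothetical decision-execution RAG MDP.

    At each step the policy first makes a *decision* (retrieve more
    evidence, or answer now), then *executes* it (write a search query,
    or generate the final answer). Process-level supervision would
    attach a reward to every step of the returned trace.
    """
    state = {"question": question, "evidence": [], "trace": []}
    for _ in range(max_steps):
        decision = policy.decide(state)           # stage 1: decision
        if decision == "retrieve":
            query = policy.write_query(state)     # stage 2: execution
            docs = retriever(query)
            state["evidence"].extend(docs)
            state["trace"].append(("retrieve", query))
        else:  # decision == "answer"
            answer = policy.answer(state)
            state["trace"].append(("answer", answer))
            return answer, state["trace"]
    # Step budget exhausted: force a final answer from current evidence.
    return policy.answer(state), state["trace"]
```

Each branch of the trace is a candidate for the pruning strategy: low-value expansions can be cut before exhaustive tree construction, which is where the reported construction-efficiency gain would come from.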

📝 Abstract
Agentic Retrieval-Augmented Generation (Agentic RAG) enhances the processing capability for complex tasks through dynamic retrieval and adaptive workflows. Recent advances (e.g., Search-R1) have shown that outcome-supervised reinforcement learning demonstrates strong performance. However, this approach still suffers from inefficient exploration, sparse reward signals, and ambiguous global reward feedback. To address these challenges, we propose DecEx-RAG, which models RAG as a Markov Decision Process (MDP) incorporating decision-making and execution, while introducing an efficient pruning strategy to optimize data expansion. Through comprehensive process-level policy optimization, DecEx-RAG significantly enhances the autonomous task decomposition, dynamic retrieval, and high-quality answer generation capabilities of large language models (LLMs). Experiments show that DecEx-RAG achieves an average absolute performance improvement of 6.2% across six datasets, significantly outperforming existing baselines. Moreover, the pruning strategy improves data construction efficiency by nearly 6×, providing an efficient solution for process-supervised RAG training. The code is available at https://github.com/sdsxdxl/DecEx-RAG.
Problem

Research questions and friction points this paper is trying to address.

Inefficient exploration and sparse reward signals in outcome-supervised agentic RAG
Ambiguous global feedback that obscures which intermediate steps helped or hurt
High cost of constructing process-supervised training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Models RAG as a Markov Decision Process combining decision-making and execution
Introduces pruning strategy to optimize data expansion efficiency
Uses process-level policy optimization for autonomous task decomposition
Yongqi Leng
TJUNLP Lab, College of Intelligence and Computing, Tianjin University, Tianjin, China
Yikun Lei
Xiaohongshu Inc.
Xikai Liu
Xiaohongshu Inc.
Meizhi Zhong
Xiaohongshu Inc.
Bojian Xiong
TJUNLP Lab, College of Intelligence and Computing, Tianjin University, Tianjin, China
Yurong Zhang
Xiaohongshu Inc.
Yan Gao
Xiaohongshu Inc.
Yi Wu
Xiaohongshu Inc.
Yao Hu
Zhejiang University
Machine Learning
Deyi Xiong
Professor, College of Intelligence and Computing, Tianjin University, China
Natural Language Processing, Large Language Models, AI4Science