From Evidence to Trajectory: Abductive Reasoning Path Synthesis for Training Retrieval-Augmented Generation Agents

📅 2025-09-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Retrieval-augmented generation (RAG) agents lack process-level supervision for task decomposition, tool invocation, and stepwise reasoning, limiting their capability in complex reasoning and tool utilization. To address this, we propose EviPath—a novel framework that introduces abductive reasoning into subtask planning for the first time, enabling evidence-anchored, interpretable reasoning paths. EviPath synthesizes high-quality, end-to-end supervision data via agent-environment interaction simulation, evidence-driven faithful question answering, and conversational fine-tuning. This facilitates fine-grained modeling and traceable training of agent behavior chains. Evaluated on multiple open-domain question-answering benchmarks, an 8B-parameter model powered by EviPath surpasses state-of-the-art methods—achieving an absolute 14.7% improvement in Exact Match (EM), demonstrating both significant performance gains and robustness.

Technology Category

Application Category

📝 Abstract
Retrieval-augmented generation agents development is hindered by the lack of process-level supervision to effectively guide agentic capabilities like task decomposition, retriever invocation, and stepwise decision-making. While reinforcement learning offers a potential solution, it suffers from sparse rewards and the limited reasoning capabilities of large language models (LLMs). Meanwhile, existing data synthesis methods only produce chain-of-thought rationales and fail to model environmental interactions. In this paper, we propose EviPath, an evidence-anchored reasoning path synthesis paradigm for RAG agent development. EviPath comprises: (i) Abductive Subtask Planning, which decomposes the problem into sub-questions and iteratively plans an optimal solution path based on the dependencies between them; (ii) Faithful Sub-question Answering, which uses supporting evidence to construct a proxy environment to generate reasoning thoughts and answers for each sub-question; and (iii) Conversational Fine-Tuning, which formats the complete agent-environment interaction trajectory into a dialogue format suitable for Supervised Fine-Tuning. EviPath allows LLMs to learn complex reasoning and tool-use capabilities directly from synthesized data. Extensive experiments on widely-used question-answering benchmarks show that an 8B parameter model trained with EviPath-synthesized data significantly and consistently outperforms state-of-the-art baselines with a double-digit absolute EM gain of 14.7% in open-domain question answering.
Problem

Research questions and friction points this paper is trying to address.

Developing RAG agents lacks process-level supervision for task decomposition
Reinforcement learning suffers from sparse rewards and limited LLM reasoning
Existing synthesis methods fail to model environmental interactions for agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Abductive subtask planning for optimal solution paths
Evidence-based proxy environment for faithful reasoning
Conversational fine-tuning for agent-environment interaction training
🔎 Similar Papers
No similar papers found.
Muzhi Li
Muzhi Li
The Chinese University of Hong Kong
Knowledge GraphNatural Language Processing
Jinhu Qi
Jinhu Qi
PhD candidate in CUHK CSE
Agentic AILLMsReasoning
Y
Yihong Wu
Université de Montréal, Montréal, Québec, Canada
M
Minghao Zhao
The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
Liheng Ma
Liheng Ma
PhD student, McGill University & Mila.
Geometric Deep LearningGraph Neural NetworksTime SeriesMachine Learning
Y
Yifan Li
The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
X
Xinyu Wang
McGill University, Montréal, Québec, Canada
Y
Yingxue Zhang
Huawei Noah’s Ark Lab, Montréal, Québec, Canada
Ho-fung Leung
Ho-fung Leung
Independent Researcher
Irwin King
Irwin King
The Chinese University of Hong Kong
social computingmachine learningAIgraph neural networksNLP