Mission Impossible: Feedback-Guided Dynamic Interactive Planning for Improving Reasoning on LLMs

📅 2025-10-07
🤖 AI Summary
Fixed action sequences make open-domain multi-hop reasoning inflexible, particularly when a question requires extensive information retrieval. This paper proposes a dynamic interactive planning framework to address that. The method first builds an initial reasoning graph by identifying key entities, then uses feedback-driven node expansion and competition among nodes in the same layer to adjust paths dynamically and backtrack from errors within a depth-first search. It combines historical error analysis with real-time feedback to form a closed optimization loop, pairing a large language model with dynamic node generation and a structured search strategy. On the evaluation benchmarks it reaches an F1 score of 54.47% on HotpotQA and 70.05% on StrategyQA, surpassing the best prior baselines by 5.03% and 7.25%, respectively. These results indicate gains in both reasoning flexibility and accuracy for multi-hop question answering.
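For readers unfamiliar with the metric, the following is a minimal sketch of token-overlap F1 as it is commonly computed for extractive question answering. It assumes the paper uses this standard formulation, which the summary does not confirm. The function name `token_f1` is hypothetical, and the normalization is deliberately simplified to lowercasing and whitespace splitting, whereas benchmark evaluation scripts typically also strip punctuation and articles.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer string.

    Simplified normalization (assumption): lowercase + whitespace split.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

With the prediction "the eiffel tower" and the gold answer "eiffel tower", precision is 2/3 and recall is 1, so the score is 0.8. A dataset-level figure is the mean of this value over all questions.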

📝 Abstract
Recent advancements in language agents have led to significant improvements in multi-hop reasoning tasks. However, because existing approaches rely on a fixed sequence of actions, they often struggle with open-domain problems, which require massive information retrieval. To address this, we propose Feedback-Guided Dynamic Interactive Planning (FGDIP), a novel framework that enhances reasoning in LLMs through dynamic, adaptive strategies for information exploration in open-domain multi-hop reasoning tasks. Our approach begins by identifying key entities relevant to the problem, which serve as the initial nodes in the reasoning process. From these initial nodes, we generate child reasoning nodes, refining the process through a combination of historical error analysis and real-time feedback, which allows the framework to dynamically adjust and optimize its reasoning strategies. By integrating depth-first search with an innovative node generation technique, our framework adapts based on both prior error paths and concurrently generated nodes at the same hierarchical level. This dynamic strategy effectively expands the search space while ensuring the reasoning process systematically converges toward accurate solutions. Experimental results show that FGDIP achieves up to 54.47% F1 score on the HotpotQA dataset and 70.05% on the StrategyQA dataset, surpassing the best baseline by 5.03% and 7.25% respectively, highlighting its versatility and potential to enhance language agents in multi-hop reasoning tasks.
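The search loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function `fgdip_style_search`, its parameters, and the toy `GRAPH` are all hypothetical, and plain Python callables stand in for the LLM calls that would propose and score reasoning steps. The sketch only shows the control flow: start from key entities, generate child nodes one at a time while showing the generator both the earlier failed paths and the siblings already produced at the same level, explore the best-scoring child first, and backtrack on dead ends.

```python
from typing import Callable, List, Optional, Tuple

Path = List[str]


def fgdip_style_search(
    entities: List[str],
    propose: Callable[[Path, List[Path], List[str]], Optional[str]],
    is_answer: Callable[[Path], bool],
    score: Callable[[str], float],
    width: int = 3,
    max_depth: int = 4,
) -> Tuple[Optional[Path], List[Path]]:
    """Depth-first search with feedback-aware child generation (illustrative)."""
    failed_paths: List[Path] = []  # history of dead ends, fed back to propose()

    def visit(path: Path) -> Optional[Path]:
        if is_answer(path):
            return path
        siblings: List[str] = []
        if len(path) < max_depth:
            for _ in range(width):
                # Each proposal sees earlier failures and the siblings so far.
                step = propose(path, failed_paths, siblings)
                if step is None:
                    break
                siblings.append(step)
        # Siblings compete: the highest-scoring child is explored first.
        for step in sorted(siblings, key=score, reverse=True):
            found = visit(path + [step])
            if found is not None:
                return found
        failed_paths.append(path)  # record the dead end, then backtrack
        return None

    for entity in entities:  # key entities are the initial nodes
        found = visit([entity])
        if found is not None:
            return found, failed_paths
    return None, failed_paths


# Toy stand-in for retrieval: which facts are reachable from which node.
GRAPH = {"Inception": ["2010", "Christopher Nolan"], "Christopher Nolan": ["London"]}


def propose(path: Path, failed_paths: List[Path], siblings: List[str]) -> Optional[str]:
    # Skip steps already proposed at this level or known to lead to a dead end.
    for candidate in GRAPH.get(path[-1], []):
        if candidate not in siblings and path + [candidate] not in failed_paths:
            return candidate
    return None


# The scorer is deliberately poor (it prefers the release year) to force a backtrack.
answer, failures = fgdip_style_search(
    entities=["Inception"],
    propose=propose,
    is_answer=lambda path: path[-1] == "London",
    score=lambda step: 1.0 if step.isdigit() else 0.5,
)
```

In this toy run the search first follows the misleading "2010" branch, records it as a failed path, and then backtracks to reach the answer through "Christopher Nolan". In a real agent, `propose` would prompt the model with the failed paths and sibling nodes as context, which is where the feedback would influence generation.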
Problem

Research questions and friction points this paper is trying to address.

Improves open-domain multi-hop reasoning in LLMs
Addresses limitations of fixed-action sequence approaches
Enhances dynamic information exploration with feedback guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic planning with real-time feedback for reasoning
Depth-first search with adaptive node generation technique
Utilizing historical errors to optimize reasoning strategies
Dong Yan
AI Chief Expert, Bosch.
Reinforcement Learning, Foundation Model
Gaochen Wu
Tsinghua University
Bowen Zhou
Tsinghua University, Shanghai AI Laboratory