🤖 AI Summary
In complex tasks, static task decomposition leads to context fragmentation, causing information loss and subtask execution failure. Method: This paper proposes a training-free, interactive LLM framework that replaces static decomposition with a subtask trajectory memory and a refined execution-summarization action, enabling completed subtasks to be dynamically reactivated and precisely reused; it further introduces a lightweight request-response protocol through which subtasks proactively obtain needed context from one another on demand. The method relies on zero-shot inference with GPT-3.5/GPT-4, requiring no fine-tuning or additional parameters. Results: Experiments on WebShop and HotpotQA show substantial gains over existing zero-shot baselines in both multi-hop reasoning and interactive decision-making accuracy. To our knowledge, this is the first approach to realize dynamic, context-aware collaborative reasoning without any training.
📝 Abstract
Large language models (LLMs) have shown remarkable capabilities in solving complex tasks. Recent work has explored decomposing such tasks into subtasks with independent contexts. However, contextually related subtasks may suffer information loss during execution, leading to redundant operations or execution failures. To address this issue, we propose a training-free framework with an interaction mechanism that enables a subtask to query specific information from, or trigger certain actions in, completed subtasks by sending requests. To implement this interaction, we introduce a subtask trajectory memory that allows completed subtasks to be resumed upon receiving interaction requests. Additionally, we propose a new action during execution that generates a concise and precise description of a subtask's execution process and outcomes, helping subsequent subtasks determine interaction targets and formulate requests. We evaluate our framework on the interactive decision-making task WebShop and the multi-hop question answering task HotpotQA, with GPT-3.5 and GPT-4; the results show that our framework outperforms state-of-the-art training-free baselines.
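The core mechanism in the abstract — archiving each completed subtask's trajectory and summary, then letting later subtasks send requests that reactivate an earlier subtask — can be sketched as below. This is a minimal illustration, not the paper's implementation: all class and method names (`Subtask`, `TrajectoryMemory`, `handle_request`) are hypothetical, and the request is resolved by simple string matching where the actual framework would re-prompt an LLM over the stored trajectory.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    name: str
    trajectory: list = field(default_factory=list)  # recorded actions/observations
    summary: str = ""  # concise description of the execution process and outcome

class TrajectoryMemory:
    """Stores completed subtask trajectories so they can be resumed on request."""

    def __init__(self):
        self._completed: dict = {}

    def archive(self, subtask: Subtask) -> None:
        # Keep the full trajectory after the subtask finishes, instead of
        # discarding its context as static decomposition would.
        self._completed[subtask.name] = subtask

    def handle_request(self, target: str, request: str) -> str:
        # Reactivate the completed subtask named `target` and answer the
        # request from its stored trajectory; the paper would query an LLM
        # here, while this sketch just scans for a matching step.
        subtask = self._completed[target]
        for step in subtask.trajectory:
            if request.lower() in step.lower():
                return step
        return subtask.summary  # fall back to the concise summary
```

A later subtask that needs, say, which item an earlier search subtask clicked would call `memory.handle_request("search", "clicked")` rather than redoing the search, which is the redundancy the interaction mechanism is meant to avoid.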