Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

📅 2026-01-12
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses a core difficulty for long-horizon agents on complex tasks: reasoning over an entangled global context raises cognitive load, lets errors propagate quickly, and makes recovery costly. The paper proposes Task-Decoupled Planning (TDP), a training-free framework built around task decoupling. A Supervisor decomposes the task into a directed acyclic graph (DAG) of sub-goals, and a Planner and Executor then reason and replan within the bounded scope of the active sub-task, isolating sub-tasks from one another. On TravelPlanner, ScienceWorld, and HotpotQA, TDP outperforms strong baselines while reducing token consumption by up to 82%.

📝 Abstract
Recent advances in large language models (LLMs) have enabled agents to autonomously execute complex, long-horizon tasks, yet planning remains a primary bottleneck for reliable task execution. Existing methods typically fall into two paradigms: step-wise planning, which is reactive but often short-sighted; and one-shot planning, which generates a complete plan upfront yet is brittle to execution errors. Crucially, both paradigms suffer from entangled contexts, where the agent must reason over a monolithic history spanning multiple sub-tasks. This entanglement increases cognitive load and lets local errors propagate across otherwise independent decisions, making recovery computationally expensive. To address this, we propose Task-Decoupled Planning (TDP), a training-free framework that replaces entangled reasoning with task decoupling. TDP decomposes tasks into a directed acyclic graph (DAG) of sub-goals via a Supervisor. Using a Planner and Executor with scoped contexts, TDP confines reasoning and replanning to the active sub-task. This isolation prevents error propagation and corrects deviations locally without disrupting the workflow. Results on TravelPlanner, ScienceWorld, and HotpotQA show that TDP outperforms strong baselines while reducing token consumption by up to 82%, demonstrating that sub-task decoupling improves both robustness and efficiency for long-horizon agents.
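The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names SubGoal, scoped_context, and run_decoupled, the retry budget, and the choice to expose only the outputs of direct dependencies as the scoped context are all assumptions made for illustration. The plan and execute callables stand in for the LLM-backed Planner and Executor, and the list of sub-goals stands in for the Supervisor's DAG.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass
class SubGoal:
    """One node of the sub-goal DAG (hypothetical structure)."""
    name: str
    deps: tuple[str, ...] = ()


def scoped_context(goal, results):
    # Assumption: a sub-task sees only the outputs of its direct dependencies,
    # never the full history of the other sub-tasks.
    return {dep: results[dep] for dep in goal.deps}


def run_decoupled(goals, plan, execute, max_retries=2):
    """Run sub-goals in dependency order, replanning locally on failure."""
    by_name = {g.name: g for g in goals}
    # graphlib expects a mapping from each node to its predecessors.
    order = TopologicalSorter({g.name: set(g.deps) for g in goals}).static_order()
    results = {}
    for name in order:
        goal = by_name[name]
        context = scoped_context(goal, results)
        feedback = None
        for _ in range(max_retries + 1):
            steps = plan(goal, context, feedback)
            ok, output = execute(goal, steps, context)
            if ok:
                results[name] = output
                break
            # Failure stays inside this sub-task: only its own plan is revised.
            feedback = output
        else:
            raise RuntimeError(f"sub-goal {name!r} failed after {max_retries + 1} attempts")
    return results


# Toy usage: the "hotel" sub-task fails once and is replanned in isolation.
goals = [
    SubGoal("flights"),
    SubGoal("hotel"),
    SubGoal("itinerary", deps=("flights", "hotel")),
]
attempts = {}
seen_contexts = {}


def toy_plan(goal, context, feedback):
    return [f"do {goal.name}"] if feedback is None else [f"redo {goal.name}"]


def toy_execute(goal, steps, context):
    attempts[goal.name] = attempts.get(goal.name, 0) + 1
    seen_contexts[goal.name] = dict(context)
    if goal.name == "hotel" and attempts["hotel"] == 1:
        return False, "no rooms available"
    return True, f"{goal.name}: {steps[0]}"


results = run_decoupled(goals, toy_plan, toy_execute)
```

In this toy run, the failed "hotel" attempt triggers a retry of that sub-task only; "flights" is never re-executed, and "itinerary" receives just the two results it depends on. Keeping each context small in this way is one plausible reading of how scoped contexts could cut token usage, though the paper's actual mechanism may differ.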
Problem

Research questions and friction points this paper is trying to address.

entangled planning
long-horizon agents
task decomposition
error propagation
context management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-Decoupled Planning
Long-Horizon Agents
Directed Acyclic Graph (DAG)
Scoped Context
Error Propagation