🤖 AI Summary
Large language model (LLM) agents often lose focus and deviate from their objectives in complex, long-horizon decision-making tasks because they lack high-level planning guidance and dynamic execution monitoring. To address this, we propose HiPlan, a hierarchical planning framework that coordinates coarse and fine levels of planning through global milestone navigation and adaptive local step generation. Our key innovation is a retrievable milestone library constructed from expert demonstrations, which enables cross-task experience reuse. HiPlan integrates semantic retrieval, task decomposition, trajectory snippet reuse, and online step-wise hint generation within a two-phase paradigm: offline library construction and online planning. Evaluated on two challenging benchmarks, HiPlan significantly outperforms strong baselines. Ablation studies confirm that the components are individually effective and complementary.
📝 Abstract
Large language model (LLM)-based agents have demonstrated remarkable capabilities in decision-making tasks, but they struggle significantly with complex, long-horizon planning scenarios. This difficulty stems from two gaps: a lack of macroscopic guidance, which causes disorientation and failures in complex tasks, and insufficient continuous oversight during execution, which leaves agents unresponsive to environmental changes and prone to deviations. To tackle these challenges, we introduce HiPlan, a hierarchical planning framework that provides adaptive global-local guidance to boost LLM-based agents' decision-making. HiPlan decomposes complex tasks into milestone action guides for general direction and step-wise hints for detailed actions. During the offline phase, we construct a milestone library from expert demonstrations, enabling structured experience reuse by retrieving semantically similar tasks and milestones. In the execution phase, trajectory segments from past milestones are dynamically adapted to generate step-wise hints that align current observations with the milestone objectives, bridging gaps and correcting deviations. Extensive experiments across two challenging benchmarks demonstrate that HiPlan substantially outperforms strong baselines, and ablation studies validate the complementary benefits of its hierarchical components.
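To make the two-phase paradigm concrete, the following is a minimal Python sketch of an offline milestone library and the online retrieval step that feeds step-wise hint generation. It is an illustration under stated assumptions, not the authors' implementation: the names `Demo`, `MilestoneLibrary`, `embed`, and `build_hint_prompt` are hypothetical, the household tasks are made-up examples, and a bag-of-words cosine similarity stands in for whatever semantic encoder the real system uses.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Demo:
    """One expert demonstration (hypothetical structure)."""
    task: str
    # Each entry pairs a milestone with the trajectory segment that achieved it.
    milestones: list[tuple[str, list[str]]]


def embed(text: str) -> Counter:
    # ASSUMPTION: bag-of-words counts stand in for a real sentence encoder.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MilestoneLibrary:
    """Offline phase: index demonstrations at the task and milestone level."""

    def __init__(self, demos: list[Demo]):
        self.demos = list(demos)

    def retrieve_tasks(self, task: str, k: int = 1) -> list[Demo]:
        # Global guidance: find the k most similar past tasks.
        q = embed(task)
        ranked = sorted(self.demos, key=lambda d: cosine(q, embed(d.task)), reverse=True)
        return ranked[:k]

    def retrieve_segment(self, milestone: str) -> tuple[str, list[str]]:
        # Local guidance: find the most similar past milestone and its segment.
        q = embed(milestone)
        candidates = [pair for d in self.demos for pair in d.milestones]
        return max(candidates, key=lambda pair: cosine(q, embed(pair[0])))


def build_hint_prompt(milestone: str, observation: str, segment: list[str]) -> str:
    # Online phase: assemble the text an LLM would receive to produce a hint.
    # The prompt wording is an assumption made for this sketch.
    past = "\n".join(f"- {action}" for action in segment)
    return (
        f"Current milestone: {milestone}\n"
        f"Current observation: {observation}\n"
        f"Actions from a similar past milestone:\n{past}\n"
        "Suggest the next step toward the milestone."
    )


library = MilestoneLibrary([
    Demo("put a clean mug in the cabinet", [
        ("find the mug", ["go to countertop 1", "take mug 1"]),
        ("clean the mug", ["go to sinkbasin 1", "clean mug 1 with sinkbasin 1"]),
        ("place the mug in the cabinet", ["go to cabinet 1", "put mug 1 in cabinet 1"]),
    ]),
    Demo("heat an egg and put it on the table", [
        ("find the egg", ["go to fridge 1", "take egg 1"]),
        ("heat the egg", ["go to microwave 1", "heat egg 1 with microwave 1"]),
        ("place the egg on the table", ["go to diningtable 1", "put egg 1 on diningtable 1"]),
    ]),
])

similar = library.retrieve_tasks("put a clean plate in the cabinet", k=1)
matched_milestone, segment = library.retrieve_segment("clean the plate")
prompt = build_hint_prompt("clean the plate", "you are holding plate 1", segment)
```

In this sketch the task-level lookup supplies a milestone sequence to follow, while the milestone-level lookup supplies a reusable trajectory segment that is combined with the current observation at each step. A real system would pass `prompt` to an LLM and re-run the lookup as milestones are completed.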