🤖 AI Summary
Despite growing interest in leveraging large language models (LLMs) for planning, which requires environmental understanding, logical reasoning, and sequential decision-making, the field lacks a systematic taxonomy and a standardized evaluation framework. Method: This paper introduces a unified classification scheme for LLM-based planning methods, grouping existing approaches into three paradigms: external module augmentation, fine-tuning-driven methods, and search-oriented techniques. It also systematically summarizes existing evaluation frameworks, covering benchmark tasks, evaluation metrics, and performance comparisons between representative methods. Contribution/Results: Through comprehensive literature analysis, methodological abstraction, and cross-paradigm synthesis of the underlying mechanisms, this work delivers a holistic survey of the field. It clarifies the technical evolution trajectory, identifies core bottlenecks (including scalability, generalization, and causal reasoning), and proposes future directions such as trustworthy planning, embodied collaboration, and neuro-symbolic integration. The study offers a structured overview and methodological roadmap for advancing LLM-based planning research.
📝 Abstract
Planning is a fundamental capability of intelligent agents, requiring comprehensive environmental understanding, rigorous logical reasoning, and effective sequential decision-making. While Large Language Models (LLMs) have demonstrated remarkable performance on certain planning tasks, their broader application in this domain warrants systematic investigation. This paper presents a comprehensive review of LLM-based planning, structured as follows. First, we establish the theoretical foundations by introducing the essential definitions and categories of automated planning. Next, we provide a detailed taxonomy and analysis of contemporary LLM-based planning methodologies, categorizing them into three principal approaches: 1) External Module Augmented Methods, which combine LLMs with additional components for planning; 2) Finetuning-based Methods, which use trajectory data and feedback signals to adjust LLMs and improve their planning abilities; and 3) Searching-based Methods, which break down complex tasks into simpler components, navigate the planning space, or enhance decoding strategies to find the best solutions. We then systematically summarize existing evaluation frameworks, including benchmark datasets, evaluation metrics, and performance comparisons between representative planning methods. Finally, we discuss the underlying mechanisms that enable LLM-based planning and outline promising research directions for this rapidly evolving field. We hope this survey will serve as a valuable resource to inspire innovation and drive progress in this area.