🤖 AI Summary
Large language models (LLMs) exhibit limited long-horizon planning capability without external feedback, typically producing plans that are both resource-inefficient to obtain and lower in quality than human baselines. To address this, we propose AoT+, an extension of the Algorithm-of-Thoughts framework that combines structured reasoning with step-wise self-validation, implicit state representation, and constrained multi-step backtracking. AoT+ enables fully internal, tool-free, re-prompting-free autonomous planning, requiring no external environment interaction or human intervention. For the first time, it enables models such as GPT-4 to autonomously surpass average human performance on classical planning benchmarks (e.g., Blocksworld), achieving state-of-the-art accuracy and executable plan validity. Our core contribution is the first LLM reasoning paradigm that reliably generates high-quality, long-horizon plans without any external feedback.
📝 Abstract
Large language models (LLMs) have demonstrated significant capabilities in natural language processing and reasoning, yet their effectiveness in autonomous planning remains under debate. While existing studies have used LLMs with external feedback mechanisms or in controlled environments for planning, these approaches often demand substantial computational and development resources due to the careful design and iterative backprompting they require. Moreover, even the most advanced LLMs, such as GPT-4, struggle to match human performance on standard planning benchmarks like Blocksworld without additional support. This paper investigates whether LLMs can independently generate long-horizon plans that rival human baselines. Our novel enhancements to Algorithm-of-Thoughts (AoT), which we dub AoT+, achieve state-of-the-art results on planning benchmarks, out-competing prior methods and human baselines, all autonomously.
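To make "executable plan validity" on Blocksworld concrete, a candidate plan can be replayed step by step against the domain's preconditions. The sketch below is a minimal illustrative simulator; the state encoding and action names (`pickup`, `putdown`, `stack`, `unstack`) are assumptions for illustration, not the paper's actual benchmark harness.

```python
# Minimal Blocksworld plan checker (illustrative sketch, not the paper's harness).
# State: {"on": {block: support}, "holding": block or None}; support is "table" or a block.

def is_clear(state, b):
    """A block is clear if nothing rests on it and the arm is not holding it."""
    return all(sup != b for sup in state["on"].values()) and state["holding"] != b

def step(state, action):
    """Apply one action, asserting its preconditions; returns the new state."""
    on, hold = dict(state["on"]), state["holding"]
    op, *args = action
    if op == "pickup":                      # pick a clear block up from the table
        (b,) = args
        assert hold is None and on.get(b) == "table" and is_clear(state, b)
        del on[b]; hold = b
    elif op == "unstack":                   # lift a clear block b off block c
        b, c = args
        assert hold is None and on.get(b) == c and is_clear(state, b)
        del on[b]; hold = b
    elif op == "putdown":                   # place the held block on the table
        (b,) = args
        assert hold == b
        on[b] = "table"; hold = None
    elif op == "stack":                     # place the held block b onto clear block c
        b, c = args
        assert hold == b and is_clear(state, c)
        on[b] = c; hold = None
    else:
        raise ValueError(f"unknown action {op}")
    return {"on": on, "holding": hold}

def valid_plan(init, goal, plan):
    """A plan is valid if every step's preconditions hold and the goal is reached."""
    state = init
    for action in plan:
        try:
            state = step(state, action)
        except AssertionError:
            return False                     # a precondition failed: inexecutable plan
    return all(state["on"].get(b) == sup for b, sup in goal.items())

# Example: invert a two-block tower (A on B  ->  B on A).
init = {"on": {"A": "B", "B": "table", "C": "table"}, "holding": None}
goal = {"B": "A"}
plan = [("unstack", "A", "B"), ("putdown", "A"), ("pickup", "B"), ("stack", "B", "A")]
print(valid_plan(init, goal, plan))  # -> True
```

An external validator like this is exactly the feedback loop that prior methods rely on; the point of AoT+ is that the model produces plans passing such a check without ever seeing its verdicts.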