LODGE: Joint Hierarchical Task Planning and Learning of Domain Models with Grounded Execution

📅 2025-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) often generate non-executable plans for long-horizon tasks, and existing domain-modeling approaches depend heavily on human feedback. This paper proposes a simulation-driven framework that jointly learns hierarchical domain models and plans with them. Low-level predicates and actions are abstracted into semantically coherent high-level concepts through symbolic inductive learning and interaction with a simulator, yielding verifiable PDDL models at multiple levels of granularity. A central error reasoner keeps plans consistent across levels. The method combines LLM prompting, classical planning (FF planner), and grounded execution verification in a closed loop. Evaluated on two International Planning Competition (IPC) domains and a long-horizon robot manipulation task, it achieves higher planning success rates than state-of-the-art domain synthesis and LLM-modulo approaches while producing high-quality hierarchical domain models.

📝 Abstract

Large Language Models (LLMs) enable planning from natural language instructions using implicit world knowledge, but often produce flawed plans that require refinement. Instead of directly predicting plans, recent methods aim to learn a problem domain that can be solved for different goal states using classical planners. However, these approaches require significant human feedback to obtain useful models. We address this shortcoming by learning hierarchical domains, where low-level predicates and actions are composed into higher-level counterparts, and by leveraging simulation to validate their preconditions and effects. This hierarchical approach is particularly powerful for long-horizon planning, where LLM-based planning approaches typically struggle. Furthermore, we introduce a central error reasoner to ensure consistency among the different planning levels. Evaluation on two challenging International Planning Competition (IPC) domains and a long-horizon robot manipulation task demonstrates higher planning success rates than state-of-the-art domain synthesis and LLM-modulo planning methods, while constructing high-quality models of the domain. Resources, videos and detailed experiment results are available at https://claudius-kienle.github.io/lodge/.
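The idea of composing low-level actions into a higher-level counterpart can be illustrated with a minimal sketch. It assumes a STRIPS-style encoding in which states are sets of ground facts; the `Action` class, the `compose` helper, and the `pick`/`place` example are hypothetical and are not taken from the paper, which learns PDDL domains with an LLM and a simulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground STRIPS-style action (illustrative, not the paper's format)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset

    def apply(self, state):
        # An action may only run in states that satisfy its preconditions.
        if not self.preconditions <= state:
            raise ValueError(f"{self.name}: preconditions not satisfied")
        return (state - self.del_effects) | self.add_effects


def compose(name, steps):
    """Fold a fixed sequence of low-level actions into one high-level action."""
    pre, add, delete = set(), set(), set()
    for step in steps:
        # Facts not produced by earlier steps must already hold at the start.
        needed = step.preconditions - add
        if needed & delete:
            raise ValueError(f"{step.name}: an earlier step removes {sorted(needed & delete)}")
        pre |= needed
        # Later effects override earlier ones.
        add = (add - step.del_effects) | step.add_effects
        delete = (delete - step.add_effects) | step.del_effects
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


pick = Action(
    "pick",
    preconditions=frozenset({"hand_empty", "on_table"}),
    add_effects=frozenset({"holding"}),
    del_effects=frozenset({"hand_empty", "on_table"}),
)
place = Action(
    "place",
    preconditions=frozenset({"holding"}),
    add_effects=frozenset({"on_shelf", "hand_empty"}),
    del_effects=frozenset({"holding"}),
)

# The high-level action should behave like running both steps in order.
macro = compose("move_to_shelf", [pick, place])
start = frozenset({"hand_empty", "on_table"})
print(sorted(macro.apply(start)))
```

A planner working at the higher level only needs `move_to_shelf`, which shortens the search horizon; the low-level steps are recovered when the plan is refined for execution.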

Problem

Research questions and friction points this paper is trying to address.

Improving flawed plans from LLMs via hierarchical domain learning

Reducing human feedback needs with simulation-validated hierarchical models

Enhancing long-horizon planning success via multi-level consistency reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning hierarchical domains for long-horizon planning

Using simulation to validate preconditions and effects

Introducing a central error reasoner for consistency across planning levels
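As a rough illustration of validating preconditions and effects in simulation, the sketch below compares what a learned action model predicts with what a simulator reports. Everything here is an assumption made for illustration: `validate_action`, the set-of-facts state encoding, and the `toy_pick` stand-in simulator are hypothetical and do not reproduce the paper's implementation.

```python
def validate_action(pre, add, delete, state, simulate):
    """Return a list of mismatches between a learned model and one simulated run."""
    missing = pre - state
    if missing:
        return [f"unmet preconditions: {sorted(missing)}"]

    predicted = (state - delete) | add
    observed = simulate(state)  # hypothetical simulator call; None means failure
    if observed is None:
        return ["simulator rejected the action although preconditions hold"]

    errors = []
    if predicted - observed:
        errors.append(f"predicted but not observed: {sorted(predicted - observed)}")
    if observed - predicted:
        errors.append(f"observed but not predicted: {sorted(observed - predicted)}")
    return errors


def toy_pick(state):
    """Stand-in simulator: picking succeeds only from the table with a free hand."""
    if {"hand_empty", "on_table"} <= state:
        return frozenset({"holding"})
    return None


start = frozenset({"hand_empty", "on_table"})

# A flawed model that forgets the object leaves the table once it is picked.
flawed = validate_action(
    pre=frozenset({"hand_empty", "on_table"}),
    add=frozenset({"holding"}),
    delete=frozenset({"hand_empty"}),
    state=start,
    simulate=toy_pick,
)

# The corrected model deletes both facts and matches the simulator.
correct = validate_action(
    pre=frozenset({"hand_empty", "on_table"}),
    add=frozenset({"holding"}),
    delete=frozenset({"hand_empty", "on_table"}),
    state=start,
    simulate=toy_pick,
)
print(flawed, correct)
```

Mismatch messages of this kind are the sort of signal that could be fed back to revise a domain model without a human in the loop.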

🔎 Similar Papers

Scalable Task Planning via Large Language Models and Structured World Representations