🤖 AI Summary
This work addresses the limited planning generalization of large language model (LLM) agents in unseen scenarios by proposing a dynamic policy-learning framework that integrates generalized planning with hierarchical task decomposition. The method, Hierarchical Component Learning for Generalized Policies (HCL-GP), automatically extracts parameterized policy components from successful executions and organizes them into a composable component library. It combines hierarchical component learning, semantic retrieval of stored components, and a dynamic reuse mechanism that enables cross-task knowledge transfer. On the AppWorld benchmark, the method reaches 98.2% on normal tasks and 97.8% on challenge tasks with unseen applications, a 15.8 percentage point improvement over static synthesis on the challenging scenarios. For open-source LLM agents, dynamic reuse raises the success rate from near zero to 62.5%.
📝 Abstract
We present a dynamic policy-learning approach that combines generalized planning and hierarchical task decomposition for LLM-based agents. Our method, Hierarchical Component Learning for Generalized Policies (HCL-GP), learns parameterized policies that generalize across task instances and automatically extracts reusable components from successful executions, organizing them into a component library for compositional policy generation. We address three challenges: (1) learning components through automated decomposition, (2) generalizing components to maximize reuse, and (3) retrieving components efficiently via semantic search. Evaluated on the AppWorld benchmark, our approach achieves 98.2% accuracy on normal tasks and 97.8% on challenge tasks with unseen applications, a 15.8-point improvement over static synthesis on challenging scenarios. For open-source models, dynamic reuse enables a 62.5% success rate, versus near zero without reuse. These results demonstrate that classical planning concepts can be effectively integrated with LLM agents for improved accuracy and efficiency.
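To make the idea of a component library with retrieval concrete, here is a minimal sketch. It is not the HCL-GP implementation: the `Component` and `ComponentLibrary` classes, the example components, and the app names are all hypothetical, and a bag-of-words cosine similarity stands in for the semantic search the abstract mentions.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Component:
    """A reusable, parameterized policy fragment (hypothetical structure)."""
    name: str
    description: str          # text used for retrieval
    params: Tuple[str, ...]   # parameters that vary across task instances
    run: Callable[..., object]


class ComponentLibrary:
    """Stores components and retrieves them by description similarity."""

    def __init__(self) -> None:
        self._items: List[Component] = []

    def add(self, component: Component) -> None:
        self._items.append(component)

    def retrieve(self, query: str, k: int = 1) -> List[Component]:
        q = embed(query)
        ranked = sorted(
            self._items,
            key=lambda c: cosine(q, embed(c.description)),
            reverse=True,
        )
        return ranked[:k]


# Two illustrative components, as if extracted from successful executions.
library = ComponentLibrary()
library.add(Component(
    name="login",
    description="log in to an app with username and password",
    params=("app", "username", "password"),
    run=lambda app, username, password: f"{app}:session:{username}",
))
library.add(Component(
    name="list_items",
    description="list all items in a collection of an app",
    params=("app", "collection"),
    run=lambda app, collection: f"{app}:{collection}:items",
))

# A new task instance reuses the best-matching component with new arguments.
best = library.retrieve("log in to the music app with my password")[0]
result = best.run(app="music", username="alice", password="secret")
```

Because each component takes its task-specific values as parameters, the same stored `login` fragment can serve any app or user. A real system would presumably replace `embed` with a learned text embedding.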