🤖 AI Summary
Current LLM-based agents struggle with complex planning tasks due to monolithic memory mechanisms, which fail to jointly ensure experience quality, knowledge diversity, and dynamic adaptability. To address this, we propose a "coarse-to-fine" hierarchical embodied memory framework: first, coarse-grained environmental focus guides high-quality experience acquisition; subsequently, mixed-granularity executable prompts are generated to support task reasoning, anomaly detection, self-reflection, and plan revision. The framework integrates offline experience storage, online trajectory analysis, memory retrieval, and a self-question-answering mechanism to achieve embodied grounding of multi-granular information. Experiments across diverse complex planning benchmarks demonstrate significant performance gains; in particular, under environmental anomalies our approach exhibits superior robustness, adaptability, and error-correction capability compared to prior methods.
📝 Abstract
Recent advancements in Large Language Models (LLMs) have driven growing interest in LLM-based agents for complex planning tasks. To avoid costly agent training, many studies adopt memory mechanisms that enhance LLMs with offline experiences or online trajectory analysis. However, existing works focus on single-granularity memory derived from dynamic environmental interactions, which is inherently constrained by the quality of the collected experiences. This limitation, in turn, constrains the diversity of knowledge and the flexibility of planning. We propose Coarse-to-Fine Grounded Memory, a novel framework that grounds coarse-to-fine memories with an LLM, thereby fully leveraging them for flexible adaptation to diverse scenarios. The framework grounds environmental information into coarse-grained focus points to guide experience collection in training tasks, then grounds actionable hybrid-grained tips from each experience. At inference, it retrieves task-relevant experiences and tips to support planning. When facing environmental anomalies, the LLM grounds the current situation into fine-grained key information, enabling flexible self-QA reflection and plan correction.
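To make the inference-time pipeline described in the abstract more concrete, here is a minimal Python sketch of the retrieve-plan-correct loop: retrieve task-relevant experiences and tips from memory, plan with them, and on an anomaly ground fine-grained key information for self-QA correction. All class, function, and method names here (`GroundedMemory`, `plan_with_memory`, etc.) are illustrative assumptions, not the paper's actual interfaces, and keyword overlap stands in for whatever retrieval the real system uses.

```python
from dataclasses import dataclass, field


@dataclass
class GroundedMemory:
    """Illustrative store of offline experiences and hybrid-grained tips."""
    experiences: dict = field(default_factory=dict)  # task -> trajectory (list of actions)
    tips: dict = field(default_factory=dict)         # task -> list of actionable tips

    def store(self, task, trajectory, tip):
        self.experiences[task] = trajectory
        self.tips.setdefault(task, []).append(tip)

    def retrieve(self, task):
        # Naive keyword overlap as a stand-in for semantic retrieval.
        best = max(
            self.experiences,
            key=lambda t: len(set(t.split()) & set(task.split())),
            default=None,
        )
        return self.experiences.get(best, []), self.tips.get(best, [])


def plan_with_memory(task, memory, llm, detect_anomaly):
    """Plan with retrieved memory; on anomaly, ground key info and revise.

    `llm` is any callable (prompt, experiences, tips) -> response;
    `detect_anomaly` returns a description of the anomaly or None.
    """
    experiences, tips = memory.retrieve(task)
    plan = llm(f"Plan for: {task}", experiences, tips)
    anomaly = detect_anomaly(plan)
    if anomaly:
        # Fine-grained grounding of the current situation, then self-QA
        # reflection and plan correction, as sketched in the abstract.
        key_info = llm(f"Key info about anomaly: {anomaly}", experiences, tips)
        plan = llm(f"Revise plan given: {key_info}", experiences, tips)
    return plan
```

A stub `llm` callable (e.g. a lambda returning its prompt) is enough to trace how retrieved tips flow into planning before wiring in a real model.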