Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following

📅 2025-09-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In dynamic, non-stationary environments, embodied agents struggle to keep continual instruction-following plans grounded in an environment memory that stays valid over time. To address this, the paper proposes the Exploratory Retrieval-Augmented Planning (ExRAP) framework. ExRAP combines information-based exploration with LLM-based task planning to maintain a time-sensitive environmental context memory, and grounds planning in that memory through retrieval augmentation. Its core components are large language model-driven plan generation, decomposition of each instruction into memory queries and query-conditioned task executions, information-based exploration integrated into planning, memory-augmented query evaluation, and temporal consistency refinement to counter knowledge decay in the memory. In experiments on VirtualHome, ALFRED, and CARLA, ExRAP consistently outperforms state-of-the-art LLM-based task planning approaches in goal success rate and execution efficiency across different instruction scales, instruction types, and degrees of environmental non-stationarity.

📝 Abstract

This study presents an Exploratory Retrieval-Augmented Planning (ExRAP) framework, designed to tackle continual instruction following tasks of embodied agents in dynamic, non-stationary environments. The framework enhances Large Language Models' (LLMs) embodied reasoning capabilities by efficiently exploring the physical environment and establishing the environmental context memory, thereby effectively grounding the task planning process in time-varying environment contexts. In ExRAP, given multiple continual instruction following tasks, each instruction is decomposed into queries on the environmental context memory and task executions conditioned on the query results. To efficiently handle these multiple tasks that are performed continuously and simultaneously, we implement an exploration-integrated task planning scheme by incorporating information-based exploration into the LLM-based planning process. Combined with memory-augmented query evaluation, this integrated scheme not only allows for a better balance between the validity of the environmental context memory and the load of environment exploration, but also improves overall task performance. Furthermore, we devise a temporal consistency refinement scheme for query evaluation to address the inherent decay of knowledge in the memory. Through experiments with VirtualHome, ALFRED, and CARLA, our approach demonstrates robustness against a variety of embodied instruction following scenarios involving different instruction scales and types, and non-stationarity degrees, and it consistently outperforms other state-of-the-art LLM-based task planning approaches in terms of both goal success rate and execution efficiency.
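To make the decomposition in the abstract concrete, here is a minimal Python sketch of an instruction split into a memory query and an execution conditioned on the query result. It is not the paper's implementation: the `ContextMemory` and `Instruction` classes, the exponential half-life decay, and the `min_confidence` threshold are all assumptions chosen for illustration, standing in for whatever memory representation and refinement scheme ExRAP actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Fact:
    """One remembered observation and the step at which it was made."""
    value: bool
    observed_at: int


class ContextMemory:
    """Toy environment memory whose confidence decays with age (assumed model)."""

    def __init__(self, half_life: float = 10.0) -> None:
        self.half_life = half_life  # hypothetical decay parameter
        self.facts: Dict[Tuple[str, str], Fact] = {}

    def observe(self, entity: str, attribute: str, value: bool, step: int) -> None:
        """Store or overwrite the latest observation of (entity, attribute)."""
        self.facts[(entity, attribute)] = Fact(value, step)

    def query(self, entity: str, attribute: str, step: int) -> Tuple[Optional[bool], float]:
        """Return the remembered value and a confidence in [0, 1]."""
        fact = self.facts.get((entity, attribute))
        if fact is None:
            return None, 0.0
        age = step - fact.observed_at
        return fact.value, 0.5 ** (age / self.half_life)


@dataclass
class Instruction:
    """A continual instruction: run `action` whenever the query matches `expected`."""
    entity: str
    attribute: str
    expected: bool
    action: str


def decide(memory: ContextMemory, instr: Instruction, step: int,
           min_confidence: float = 0.6) -> str:
    """Execute, wait, or request exploration, depending on memory confidence."""
    value, confidence = memory.query(instr.entity, instr.attribute, step)
    if confidence < min_confidence:
        return "explore:" + instr.entity  # memory too stale to trust
    if value == instr.expected:
        return instr.action
    return "wait"


memory = ContextMemory(half_life=10.0)
memory.observe("tv", "is_on", True, step=0)
rule = Instruction("tv", "is_on", True, "turn_off_tv")
fresh_decision = decide(memory, rule, step=0)
stale_decision = decide(memory, rule, step=10)
```

In this toy setup the same instruction yields an action while the observation is fresh and an exploration request once its confidence has halved, which is the trade-off between memory validity and exploration load that the abstract describes.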

Problem

Research questions and friction points this paper is trying to address.

Enhancing embodied agents' continual instruction following in dynamic environments

Improving task planning by integrating exploration with environmental context memory

Addressing knowledge decay in memory for robust performance across scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploratory Retrieval-Augmented Planning framework

Information-based exploration in LLM planning

Temporal consistency refinement for memory
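As an illustration of what information-based exploration can look like, the sketch below scores candidate locations by the binary entropy of the agent's belief about each one, minus a travel-cost penalty. This is a generic information-gain heuristic written under stated assumptions, not the scoring rule from the paper: the function names, the `cost_weight` parameter, and the assumption that one observation fully resolves the uncertainty are all hypothetical.

```python
import math
from typing import Dict, Optional


def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no belief held with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def pick_exploration_target(beliefs: Dict[str, float],
                            travel_cost: Dict[str, float],
                            cost_weight: float = 0.1) -> Optional[str]:
    """Pick the target whose expected information gain best outweighs its cost.

    Assumes a perfect observation, so the expected gain equals the current
    entropy of the belief. Returns None when no target has a positive score.
    """
    best: Optional[str] = None
    best_score = 0.0
    for target, p in beliefs.items():
        score = binary_entropy(p) - cost_weight * travel_cost[target]
        if score > best_score:
            best, best_score = target, score
    return best


# Hypothetical beliefs that some queried condition holds in each room.
beliefs = {"kitchen": 0.5, "bedroom": 0.95, "garage": 0.5}
travel_cost = {"kitchen": 2.0, "bedroom": 1.0, "garage": 9.0}
target = pick_exploration_target(beliefs, travel_cost)
```

Here the kitchen is chosen because its belief is maximally uncertain and it is cheap to reach, while the equally uncertain garage is penalized by its travel cost.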

🔎 Similar Papers

Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following