🤖 AI Summary
To address insufficient decision robustness and adaptability in long-horizon embodied task planning within dynamic environments, this paper proposes a framework that integrates spatiotemporal memory, a dynamic knowledge graph, and planner-critic co-optimization. An updatable spatiotemporal memory module and an event-driven knowledge graph support continuous environmental perception and modeling of historical experience. A reinforcement learning-driven iterative planner-critic architecture then closes the loop between policy generation and real-time evaluation. Evaluated on 32 complex multi-step tasks in TextWorld, the method achieves a 31.25% improvement in task success rate and a 24.7% increase in average score over prior state-of-the-art approaches. These results indicate gains in memory persistence, spatial reasoning, and online adaptation, highlighting the framework's advantages for embodied AI in dynamic environments.
📝 Abstract
A key objective of embodied intelligence is enabling agents to perform long-horizon tasks in dynamic environments while maintaining robust decision-making and adaptability. To achieve this goal, we propose the Spatio-Temporal Memory Agent (STMA), a novel framework designed to enhance task planning and execution by integrating spatio-temporal memory. STMA is built upon three critical components: (1) a spatio-temporal memory module that captures historical and environmental changes in real time, (2) a dynamic knowledge graph that facilitates adaptive spatial reasoning, and (3) a planner-critic mechanism that iteratively refines task strategies. We evaluate STMA in the TextWorld environment on 32 tasks, involving multi-step planning and exploration under varying levels of complexity. Experimental results demonstrate that STMA achieves a 31.25% improvement in success rate and a 24.7% increase in average score compared to the state-of-the-art model. The results highlight the effectiveness of spatio-temporal memory in advancing the memory capabilities of embodied agents.
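The three components named in the abstract can be illustrated with a toy sketch. This is not the STMA implementation: every class and function name below (`SpatioTemporalMemory`, `DynamicKnowledgeGraph`, `critic`, `plan_with_critic`) is hypothetical, the "planner" is replaced by a plain breadth-first search over the graph, and the "critic" simply simulates a plan against the graph. It only shows how a memory log, an updatable graph, and a propose-evaluate-revise loop might fit together in a text-game setting.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SpatioTemporalMemory:
    """Hypothetical time-stamped log of (step, location, observation) records."""
    records: list = field(default_factory=list)

    def add(self, step, location, observation):
        self.records.append((step, location, observation))

    def latest_location(self):
        return self.records[-1][1] if self.records else None


@dataclass
class DynamicKnowledgeGraph:
    """Hypothetical directed room graph: room -> {direction: neighbouring room}."""
    edges: dict = field(default_factory=dict)

    def connect(self, src, direction, dst):
        self.edges.setdefault(src, {})[direction] = dst
        self.edges.setdefault(dst, {})

    def disconnect(self, src, direction):
        # Models an environment change, e.g. a door that has been locked.
        self.edges.get(src, {}).pop(direction, None)

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of directions, or None."""
        queue, seen = deque([(start, [])]), {start}
        while queue:
            room, path = queue.popleft()
            if room == goal:
                return path
            for direction, nxt in self.edges.get(room, {}).items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [direction]))
        return None


def critic(graph, start, plan, goal):
    """Simulate the plan on the graph and return (ok, feedback)."""
    room = start
    for i, direction in enumerate(plan):
        nxt = graph.edges.get(room, {}).get(direction)
        if nxt is None:
            return False, f"step {i}: cannot go {direction} from {room}"
        room = nxt
    if room != goal:
        return False, f"plan ends in {room}, not {goal}"
    return True, "ok"


def plan_with_critic(graph, memory, goal, draft, max_rounds=3):
    """Propose a plan, let the critic check it, and revise until accepted."""
    start = memory.latest_location()
    plan = draft
    for _ in range(max_rounds):
        ok, _feedback = critic(graph, start, plan, goal)
        if ok:
            return plan
        # Stand-in planner (assumption): replan from the current graph.
        plan = graph.shortest_path(start, goal)
        if plan is None:
            return None
    return None


graph = DynamicKnowledgeGraph()
graph.connect("kitchen", "east", "hall")
graph.connect("hall", "north", "pantry")

memory = SpatioTemporalMemory()
memory.add(0, "kitchen", "You are in the kitchen.")

# The flawed draft is rejected by the critic and replaced by a valid route.
plan = plan_with_critic(graph, memory, "pantry", draft=["north"])

# After the environment changes, no valid plan remains.
graph.disconnect("hall", "north")
blocked = plan_with_critic(graph, memory, "pantry", draft=["north"])
```

In this sketch the graph is the only source of spatial knowledge, so updating it when the environment changes is enough for the next planning round to adapt. A real agent would presumably derive graph updates from the memory log and use a learned or language-model planner in place of the search.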