🤖 AI Summary
ObjectNav struggles in dynamic, long-horizon tasks because of long-term memory decay and insufficient spatial reasoning. To address this, we propose a spatial memory framework built on an updatable topological graph. The graph serves as a unified representation that integrates real-time visual observations with persistent semantic and structural knowledge, enabling joint modeling of adjacency, connectivity, and scene semantics. A dedicated graph update mechanism supports incremental map construction and goal-directed path re-planning when the environment changes. Evaluated on standard ObjectNav benchmarks, including AI2THOR and RoboTHOR, our method achieves state-of-the-art performance, improving task success rate by 8.2% and path efficiency (SPL) by 23.5%, with particularly strong generalization in multi-room and highly dynamic environments. The core innovation lies in elevating the topological graph from a static map to an active spatial cognitive hub capable of memory integration and relational reasoning.
📝 Abstract
Object Navigation (ObjectNav) has advanced considerably with large language models (LLMs), but memory management remains a challenge, especially in long-horizon tasks and dynamic scenes. To address this, we propose TopoNav, a new framework that uses topological structures as spatial memory. By building and updating a topological graph that captures scene connectivity, adjacency, and semantics, TopoNav helps agents accumulate spatial knowledge over time, retrieve key information, and reason effectively toward distant goals. Our experiments show that TopoNav achieves state-of-the-art performance on benchmark ObjectNav datasets, with higher success rates and more efficient paths. It particularly excels in diverse and complex environments because it links transient visual inputs to persistent spatial understanding.
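
To make the idea of a topological graph as spatial memory concrete, the following is a minimal illustrative sketch, not TopoNav's actual implementation. It assumes places are graph nodes annotated with observed object labels, edges mark traversable adjacency, and re-planning is a plain breadth-first search. The class name `TopoMemory`, its methods, and the room names are all hypothetical.

```python
from collections import deque


class TopoMemory:
    """Toy topological memory (hypothetical): nodes are places, edges are traversable links."""

    def __init__(self):
        self.objects = {}  # node id -> set of object labels observed there
        self.edges = {}    # node id -> set of adjacent node ids

    def observe(self, node, labels=(), prev=None):
        """Add or refresh a node; optionally link it to the previously visited node."""
        self.objects.setdefault(node, set()).update(labels)
        self.edges.setdefault(node, set())
        if prev is not None:
            self.objects.setdefault(prev, set())
            self.edges.setdefault(prev, set()).add(node)
            self.edges[node].add(prev)

    def block(self, a, b):
        """Drop an edge that is no longer traversable (for example, a closed door)."""
        self.edges.get(a, set()).discard(b)
        self.edges.get(b, set()).discard(a)

    def plan(self, start, goal_label):
        """Breadth-first search for a fewest-hop path to a node containing goal_label."""
        if start not in self.edges:
            return None
        parent = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if goal_label in self.objects[node]:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in sorted(self.edges[node]):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None  # goal object not reachable from start in the current graph


# Incrementally build the graph, then re-plan after the environment changes.
mem = TopoMemory()
mem.observe("hall")
mem.observe("kitchen", {"mug"}, prev="hall")
mem.observe("living_room", {"sofa"}, prev="hall")
mem.observe("pantry", prev="living_room")
mem.observe("kitchen", prev="pantry")

path_before = mem.plan("hall", "mug")  # direct route through the hall-kitchen edge
mem.block("hall", "kitchen")           # the direct link becomes impassable
path_after = mem.plan("hall", "mug")   # detour found from the remembered graph
```

In this toy setting, the stored graph lets the agent fall back on a remembered detour instead of re-exploring. A real system would presumably attach visual features and richer semantics to nodes and use a learned or LLM-guided planner in place of breadth-first search.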