CAUSALNAV: A Long-Term Embodied Navigation System for Autonomous Mobile Robots in Dynamic Outdoor Scenarios

📅 2026-01-05
🏛️ IEEE Robotics and Automation Letters
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses three challenges of language-guided navigation in large-scale dynamic outdoor environments: semantic reasoning, environmental dynamism, and long-term stability. It proposes a multi-level semantic scene graph navigation framework that integrates offline maps with real-time perception, which the authors describe as the first scene graph-based semantic navigation framework tailored for dynamic outdoor environments. The framework uses large language models to build an embodied scene graph that is updated within a temporal window, explicitly models moving objects, and supports spatial reasoning at multiple granularities as well as hierarchical planning under open-vocabulary queries. The graph also serves as the knowledge base for retrieval-augmented generation, which enables long-horizon semantic navigation. Experiments in both simulated and real-world dynamic outdoor settings report improved navigation robustness, efficiency, and long-term performance.

📝 Abstract
Autonomous language-guided navigation in large-scale outdoor environments remains a key challenge in mobile robotics, due to difficulties in semantic reasoning, dynamic conditions, and long-term stability. We propose CausalNav, the first scene graph-based semantic navigation framework tailored for dynamic outdoor environments. We construct a multi-level semantic scene graph using LLMs, referred to as the Embodied Graph, that hierarchically integrates coarse-grained map data with fine-grained object entities. The constructed graph serves as a retrievable knowledge base for Retrieval-Augmented Generation (RAG), enabling semantic navigation and long-range planning under open-vocabulary queries. By fusing real-time perception with offline map data, the Embodied Graph supports robust navigation across varying spatial granularities in dynamic outdoor environments. Dynamic objects are explicitly handled in both the scene graph construction and hierarchical planning modules. The Embodied Graph is continuously updated within a temporal window to reflect environmental changes and support real-time semantic navigation. Extensive experiments in both simulation and real-world settings demonstrate superior robustness and efficiency.
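
As a rough illustration of the ideas in the abstract (a hierarchical graph, pruning of dynamic objects outside a temporal window, and retrieval over the graph), here is a minimal sketch. It is not the CausalNav implementation: the `Node` schema, the `prune_stale` and `retrieve` helpers, and the example campus are all hypothetical, and plain word overlap stands in for whatever LLM-based retrieval the paper actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entity in a toy hierarchical scene graph (hypothetical schema)."""
    name: str
    level: str                 # e.g. "region", "building", "object"
    description: str
    dynamic: bool = False      # True for moving objects such as pedestrians
    last_seen: float = 0.0     # timestamp of the latest observation, in seconds
    children: list["Node"] = field(default_factory=list)


def prune_stale(node: Node, now: float, window: float) -> None:
    """Drop dynamic descendants not observed within the last `window` seconds."""
    node.children = [
        c for c in node.children
        if not (c.dynamic and now - c.last_seen > window)
    ]
    for c in node.children:
        prune_stale(c, now, window)


def retrieve(root: Node, query: str, k: int = 3) -> list[tuple[int, str]]:
    """Rank nodes by word overlap with the query; return (score, path) pairs.

    Word overlap is an assumption made for this sketch, standing in for a
    learned retriever.
    """
    terms = set(query.lower().split())
    hits: list[tuple[int, str]] = []

    def visit(node: Node, path: list[str]) -> None:
        path = path + [node.name]
        words = set(f"{node.name} {node.description}".lower().split())
        score = len(terms & words)
        if score:
            hits.append((score, " > ".join(path)))
        for c in node.children:
            visit(c, path)

    visit(root, [])
    hits.sort(key=lambda h: -h[0])  # highest score first; ties keep visit order
    return hits[:k]


# Coarse map-level nodes at the top, fine-grained objects at the leaves.
campus = Node("campus", "region", "university campus from offline map", children=[
    Node("library", "building", "library building with main entrance", children=[
        Node("bench", "object", "wooden bench near entrance"),
        Node("cyclist", "object", "moving cyclist", dynamic=True, last_seen=2.0),
    ]),
    Node("cafeteria", "building", "cafeteria with outdoor seating"),
])

# The cyclist was last seen 8 s ago, outside the 5 s window, so it is removed.
prune_stale(campus, now=10.0, window=5.0)
print(retrieve(campus, "bench near the library entrance"))
```

In a retrieval-augmented setup, the returned paths would be passed to a language model as context for planning. Here they are only printed, and static nodes are never pruned regardless of their `last_seen` value.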
Problem

Research questions and friction points this paper is trying to address.

autonomous navigation
dynamic outdoor environments
semantic reasoning
long-term stability
language-guided navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

scene graph
embodied navigation
retrieval-augmented generation
dynamic outdoor environments
large language models
Hongbo Duan
Center for Artificial Intelligence and Robotics, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Shangyi Luo
Center for Artificial Intelligence and Robotics, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Zhiyuan Deng
Center for Artificial Intelligence and Robotics, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Yanbo Chen
Tsinghua University
Robotics, Autonomous Navigation, Motion Planning
Yuanhao Chiang
Center for Artificial Intelligence and Robotics, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Yi Liu
Tsinghua University
Robot Visual SLAM
Fangming Liu
Professor, School of Computer Science & Technology, Huazhong University of Science & Technology
AI & Cloud Computing, Datacenter, LLM System, Edge Computing, Green Computing
Xueqian Wang
Tsinghua University
Information Fusion, Target Detection, Radar Imaging, Image Processing