Constraint-Aware Zero-Shot Vision-Language Navigation in Continuous Environments

📅 2024-12-13
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
Zero-shot vision-language navigation in continuous environments (VLN-CE) faces three core challenges: absence of expert demonstrations, weak environmental priors, and a continuous action space. Method: We propose a constraint-aware sub-instruction sequence modeling framework. It combines a constraint-driven mechanism for decomposing instructions into sub-instructions and switching between them with a superpixel-guided online refinement of value maps, enabling on-the-fly value estimation and more stable decision-making. Contribution/Results: The approach addresses the two limitations inherent in zero-shot settings: the lack of training trajectories and the lack of structural priors about the environment. On the validation unseen splits of R2R-CE and RxR-CE, it achieves state-of-the-art Success Rate, improving over the previous best method by 12% and 13%, respectively. The method has also been deployed on real-world robots across various indoor scenes and instructions.

📝 Abstract
We address the task of Vision-Language Navigation in Continuous Environments (VLN-CE) under the zero-shot setting. Zero-shot VLN-CE is particularly challenging because there are no expert demonstrations for training and only minimal structural priors about the environment to guide navigation. To confront these challenges, we propose a Constraint-Aware Navigator (CA-Nav), which reframes zero-shot VLN-CE as a sequential, constraint-aware sub-instruction completion process. CA-Nav continuously translates sub-instructions into navigation plans using two core modules: the Constraint-Aware Sub-instruction Manager (CSM) and the Constraint-Aware Value Mapper (CVM). CSM defines the completion criteria for decomposed sub-instructions as constraints and tracks navigation progress by switching sub-instructions in a constraint-aware manner. CVM, guided by CSM's constraints, generates a value map on the fly and refines it using superpixel clustering to improve navigation stability. CA-Nav achieves state-of-the-art performance on two VLN-CE benchmarks, surpassing the previous best method by 12 percent and 13 percent in Success Rate on the validation unseen splits of R2R-CE and RxR-CE, respectively. Moreover, CA-Nav demonstrates its effectiveness in real-world robot deployments across various indoor scenes and instructions.
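The constraint-aware switching that the abstract attributes to CSM can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the dictionary-based observation format, and the step-limit fallback (`max_steps_per_sub`) are all assumptions made for illustration. The sketch only shows the general idea of treating completion criteria as constraints and advancing to the next sub-instruction once they hold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical observation format: whatever the perception stack reports.
Observation = Dict[str, object]


@dataclass
class Constraint:
    """A completion criterion for one sub-instruction (hypothetical structure)."""
    description: str
    is_satisfied: Callable[[Observation], bool]


@dataclass
class SubInstruction:
    text: str
    constraints: List[Constraint]


class SubInstructionManager:
    """Tracks the active sub-instruction and switches when its constraints hold."""

    def __init__(self, sub_instructions: List[SubInstruction],
                 max_steps_per_sub: int = 25) -> None:
        self.subs = list(sub_instructions)
        self.index = 0
        self.steps_on_current = 0
        # Assumed fallback so the agent cannot stall on an unmet constraint.
        self.max_steps_per_sub = max_steps_per_sub

    @property
    def done(self) -> bool:
        return self.index >= len(self.subs)

    @property
    def current(self) -> Optional[SubInstruction]:
        return None if self.done else self.subs[self.index]

    def update(self, obs: Observation) -> bool:
        """Process one step; return True if the manager switched sub-instructions."""
        if self.done:
            return False
        self.steps_on_current += 1
        satisfied = all(c.is_satisfied(obs) for c in self.current.constraints)
        timed_out = self.steps_on_current >= self.max_steps_per_sub
        if satisfied or timed_out:
            self.index += 1
            self.steps_on_current = 0
            return True
        return False
```

In use, an instruction such as "walk past the sofa, then enter the kitchen" would become two `SubInstruction` objects, each with a predicate over the observation, and `update` would be called once per navigation step. Keeping the constraints as plain callables leaves open how they are evaluated, for example by an object detector or a scene classifier.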
Problem

Research questions and friction points this paper is trying to address.

Zero-shot Vision-Language Navigation without expert training data
Continuous environment navigation with minimal structural prior knowledge
Constraint-aware sub-instruction completion for stable navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

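Constraint-Aware Sub-instruction Manager tracks progress by switching sub-instructions once their constraints are met
Constraint-Aware Value Mapper builds a value map on the fly and refines it with superpixel clustering for more stable navigation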
Sequential constraint-aware sub-instruction completion process
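The value-map refinement mentioned above can be sketched in a few lines. The page does not specify the refinement rule, so this example assumes one simple choice: every cell takes the mean value of its superpixel region, which smooths out isolated noisy cells. The function names are hypothetical, and the superpixel labels are assumed to be precomputed by a separate clustering step (for instance a SLIC-style algorithm) that is not shown here.

```python
from collections import defaultdict
from typing import Dict, List

Grid = List[List[float]]   # per-cell navigation values
Labels = List[List[int]]   # per-cell superpixel label, same shape as the grid


def region_means(values: Grid, labels: Labels) -> Dict[int, float]:
    """Compute the mean value of each superpixel region."""
    if len(values) != len(labels) or any(
        len(v_row) != len(l_row) for v_row, l_row in zip(values, labels)
    ):
        raise ValueError("values and labels must have the same shape")
    totals: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for v_row, l_row in zip(values, labels):
        for value, region in zip(v_row, l_row):
            totals[region] += value
            counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


def refine_value_map(values: Grid, labels: Labels) -> Grid:
    """Assumed refinement: replace each cell with the mean of its region."""
    means = region_means(values, labels)
    return [[means[region] for region in l_row] for l_row in labels]


def best_region(values: Grid, labels: Labels) -> int:
    """Return the label of the region with the highest mean value."""
    means = region_means(values, labels)
    return max(means, key=means.get)
```

Selecting a target at the region level rather than the single highest-valued cell is one plausible reason such a refinement could stabilize navigation: a lone outlier cell no longer redirects the agent.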
Kehan Chen
New Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences and School of Artificial Intelligence, University of Chinese Academy of Sciences, China
Dongyan An
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE
Yan Huang
New Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences and School of Artificial Intelligence, University of Chinese Academy of Sciences, China
Rongtao Xu
MBZUAI << CASIA << HUST
Intelligent Robot, Embodied AI, VLA, VLM, Spatialtemporal AI
Yifei Su
Institute of Automation, Chinese Academy of Sciences
Embodied AI, Multimodal Learning
Yonggen Ling
Tencent Robotics X
SLAM, VIO, Sensor Fusion, Computer Vision, 3D Reconstruction
Ian Reid
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE
Liang Wang
New Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences and School of Artificial Intelligence, University of Chinese Academy of Sciences, China