🤖 AI Summary
This work addresses a critical limitation in existing Vision-and-Language Navigation (VLN) systems, which assume that the target specified in an instruction always exists and thus struggle with instructions based on false premises. The study presents the first systematic investigation of such scenarios, introducing the VLN-NF benchmark, which requires agents to actively explore environments and explicitly determine when a target is “NOT-FOUND.” To tackle this challenge, the authors propose ROAM, a method combining supervised room-level navigation with fine-grained exploration driven by large language models (LLMs) or vision-language models (VLMs) that leverage free-space priors. Key contributions include a scalable data generation pipeline, a novel evaluation metric (REV-SPL), and a two-stage hybrid navigation strategy. Experiments demonstrate that ROAM significantly outperforms baseline methods on VLN-NF; the baselines often fail due to insufficient exploration and the resulting misjudgment.
📝 Abstract
Conventional Vision-and-Language Navigation (VLN) benchmarks assume instructions are feasible and the referenced target exists, leaving agents ill-equipped to handle false-premise goals. We introduce VLN-NF, a benchmark with false-premise instructions in which the target is absent from the specified room, so agents must navigate, gather evidence through in-room exploration, and explicitly output NOT-FOUND. VLN-NF is constructed via a scalable pipeline that rewrites VLN instructions using an LLM and verifies target absence with a VLM, producing plausible yet factually incorrect goals. We further propose REV-SPL to jointly evaluate room reaching, exploration coverage, and decision correctness. To address this challenge, we present ROAM, a two-stage hybrid method that combines supervised room-level navigation with LLM/VLM-driven in-room exploration guided by a free-space clearance prior. ROAM achieves the best REV-SPL among compared methods, while baselines often under-explore and terminate prematurely under unreliable instructions. The VLN-NF project page can be found at https://vln-nf.github.io/.