Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification

📅 2026-04-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing vision-language navigation agents frequently disregard semantic regulatory constraints in real-world urban environments, which often leads to non-compliant behavior. To address this, the authors introduce Rule-VLN, the first city-scale benchmark for rule-compliant navigation, encompassing 177 rule categories and 8,000 constrained nodes. They further propose SNRM, a general-purpose, zero-shot Semantic Navigation Rectification Module that integrates semantic reasoning with geometric correction. SNRM combines a coarse-to-fine vision-language model with a cognitive mental map to enable dynamic path replanning in which perception and rules work together. Experimental results show that SNRM significantly improves navigation performance on Rule-VLN, reducing violation rates by 19.26% and improving task completion rates by 5.97%.

📝 Abstract

As embodied AI transitions to real-world deployment, the measure of success in Vision-and-Language Navigation (VLN) is shifting from mere reachability to social compliance. However, current agents suffer from a "goal-driven trap": they prioritize physical geometry ("can I go?") over semantic rules ("may I go?") and frequently overlook subtle regulatory constraints. To bridge this gap, we establish Rule-VLN, the first large-scale urban benchmark for rule-compliant navigation. Spanning a 29k-node environment, it injects 177 diverse regulatory categories into 8k constrained nodes across four curriculum levels, challenging agents with fine-grained visual and behavioral constraints. We further propose the Semantic Navigation Rectification Module (SNRM), a universal, zero-shot module designed to equip pre-trained agents with safety awareness. SNRM integrates a coarse-to-fine visual perception VLM framework with an epistemic mental map for dynamic detour planning. Experiments demonstrate that while Rule-VLN challenges state-of-the-art models, SNRM significantly restores navigation capabilities, reducing CVR by 19.26% and boosting TC by 5.97%.
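
The abstract does not spell out how SNRM plans its detours, so the following is only a loose illustration of the general idea of replanning around nodes that a perception step has flagged as off-limits. It assumes the environment is a plain adjacency dictionary, the "mental map" is a set of flagged node IDs, and the planner is an ordinary breadth-first search; the function name `plan_detour` and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import deque


def plan_detour(graph, start, goal, blocked):
    """Return a shortest path from start to goal that avoids blocked nodes.

    Hypothetical helper: `graph` maps a node to its neighbors and `blocked`
    stands in for a mental map of nodes flagged as rule-violating.
    Returns None when no compliant path exists.
    """
    if start in blocked or goal in blocked:
        return None
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents and neighbor not in blocked:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Toy four-node graph: two routes from A to D, via B or via C.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

mental_map = set()
direct = plan_detour(graph, "A", "D", mental_map)

# Suppose a perception step now flags node B (for example, a no-entry sign).
mental_map.add("B")
detour = plan_detour(graph, "A", "D", mental_map)
```

In this sketch the agent first routes through B, then switches to the route through C once B is recorded as restricted. A real system would presumably weigh path cost and rule severity rather than treat every flagged node as a hard block.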

Problem

Research questions and friction points this paper is trying to address.

Vision-and-Language Navigation

social compliance

regulatory constraints

embodied AI

rule-compliant navigation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Rule-VLN

Semantic Navigation Rectification Module

vision-and-language navigation