AURA: Multimodal Shared Autonomy for Real-World Urban Navigation

📅 2026-04-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the fatigue, inefficiency, and safety risks of prolonged human-operated navigation in complex urban environments by proposing AURA, a multimodal shared-autonomy framework. AURA decouples navigation into high-level semantic instructions from the human and low-level motion control executed by the AI. It introduces a spatial-aware instruction encoder that aligns vision-language commands with environmental context; training is supported by MM-CoS, a large-scale multimodal collaborative dataset. Experiments in both simulated and real-world settings show that AURA reduces human intervention frequency by over 44% while improving navigation stability and execution efficiency.
📝 Abstract
Long-horizon navigation in complex urban environments relies heavily on continuous human operation, which leads to fatigue, reduced efficiency, and safety concerns. Shared autonomy, in which a vision-language AI agent and a human operator collaborate to maneuver the mobile machine, is a promising way to address these issues. However, existing shared autonomy methods often require humans and AI to operate within the same action space, leading to high cognitive overhead. We present Assistive Urban Robot Autonomy (AURA), a new multimodal framework that decomposes urban navigation into high-level human instruction and low-level AI control. AURA incorporates a Spatial-Aware Instruction Encoder to align diverse human instructions with visual and spatial context. To facilitate training, we construct MM-CoS, a large-scale dataset comprising teleoperation trajectories and vision-language descriptions. Experiments in simulation and the real world demonstrate that AURA effectively follows human instructions, reduces manual operation effort, and improves navigation stability, while enabling online adaptation. Moreover, under similar takeover conditions, our shared autonomy framework reduces the frequency of takeovers by more than 44%. A demo video and further details are provided on the project page.
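The core idea of the decomposition described above can be sketched in code: the human issues sparse, high-level semantic instructions, while the AI runs a dense low-level control loop that maps the current instruction and observation to motion commands. The sketch below is purely illustrative and assumes hypothetical names (`SharedAutonomyController`, `Instruction`, `Observation`); the paper's actual policy is a learned vision-language model, not keyword matching.

```python
from dataclasses import dataclass

@dataclass
class Instruction:
    text: str  # high-level semantic command, e.g. "turn left at the crosswalk"

@dataclass
class Observation:
    image: object    # camera frame (placeholder)
    position: tuple  # (x, y) in map coordinates

class SharedAutonomyController:
    """Illustrative split: the human supplies occasional high-level
    instructions; the AI produces low-level motion commands every cycle."""

    def __init__(self):
        self.current_instruction = None

    def set_instruction(self, instruction: Instruction):
        # Human side: issued sparsely, not at every control step.
        self.current_instruction = instruction

    def step(self, obs: Observation) -> tuple:
        # AI side: runs every control cycle, returning (linear, angular)
        # velocity. A real system would condition a learned policy on the
        # encoded instruction plus visual and spatial context.
        if self.current_instruction is None:
            return (0.0, 0.0)  # no guidance yet: stay put
        if "left" in self.current_instruction.text:
            return (0.5, 0.3)
        if "right" in self.current_instruction.text:
            return (0.5, -0.3)
        return (0.5, 0.0)  # default: continue straight

ctrl = SharedAutonomyController()
ctrl.set_instruction(Instruction("turn left at the crosswalk"))
cmd = ctrl.step(Observation(image=None, position=(0.0, 0.0)))
print(cmd)  # (0.5, 0.3)
```

Because the two agents act in different action spaces (language vs. velocity commands), the human never has to micro-manage steering, which is the cognitive-overhead reduction the abstract claims.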
Problem

Research questions and friction points this paper is trying to address.

shared autonomy
urban navigation
human-AI collaboration
cognitive overhead
long-horizon navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

shared autonomy
multimodal learning
vision-language navigation
human-robot collaboration
spatial-aware instruction encoding
🔎 Similar Papers
No similar papers found.