Enhancing Safety of Foundation Models for Visual Navigation through Collision Avoidance via Repulsive Estimation

📅 2025-06-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor out-of-distribution (OOD) generalization and high collision rates of purely RGB-based visual navigation models, this paper proposes CARE, a plug-and-play safety-enhancement module. CARE requires no fine-tuning of pretrained models and no depth sensors: it uses monocular depth estimation to generate physics-inspired repulsive force vectors for real-time, trajectory-level re-planning. Its core contribution is enhancing the safety of visual navigation with zero fine-tuning and zero hardware modification. Experiments across multiple ROS-based robot platforms show that CARE reduces collision rates by up to 100%, increases collision-free travel distance in exploration tasks by up to 10.7×, and preserves goal-reaching performance without compromising navigation accuracy or efficiency.

📝 Abstract
We propose CARE (Collision Avoidance via Repulsive Estimation), a plug-and-play module that enhances the safety of vision-based navigation without requiring additional range sensors or fine-tuning of pretrained models. While recent foundation models using only RGB inputs have shown strong performance, they often fail to generalize in out-of-distribution (OOD) environments with unseen objects or variations in camera parameters (e.g., field of view, pose, or focal length). Without fine-tuning, these models may generate unsafe trajectories that lead to collisions, requiring costly data collection and retraining. CARE addresses this limitation by seamlessly integrating with any RGB-based navigation system that outputs local trajectories, dynamically adjusting them using repulsive force vectors derived from monocular depth maps. We evaluate CARE by combining it with state-of-the-art vision-based navigation models across multiple robot platforms. CARE consistently reduces collision rates (up to 100%) without sacrificing goal-reaching performance and improves collision-free travel distance by up to 10.7x in exploration tasks.
Problem

Research questions and friction points this paper is trying to address.

Enhancing safety of vision-based navigation without extra sensors
Addressing failure in unseen environments with varied objects and camera settings
Reducing collision rates without compromising goal-reaching performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Plug-and-play module enhances vision-based navigation safety
Uses repulsive force vectors from monocular depth maps
Reduces collisions without fine-tuning pretrained models
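The repulsive-estimation idea above can be sketched in a few lines. This is a minimal illustration only, not the paper's actual formulation: the parameter names (`fov_deg`, `d_max`, `gain`), the inverse-square falloff, and the column-wise depth aggregation are all assumptions made for the sketch.

```python
import numpy as np

def repulsive_vector(depth, fov_deg=90.0, d_max=3.0, gain=0.5):
    """Aggregate a 2D repulsive force from a monocular depth map (meters).

    Hypothetical sketch: each image column pushes the robot away from its
    bearing direction, with force growing as obstacles get closer.
    """
    h, w = depth.shape
    # Bearing angle of each image column within the horizontal FOV
    # (negative = left of the optical axis, positive = right).
    angles = np.radians(np.linspace(-fov_deg / 2, fov_deg / 2, w))
    # Nearest depth per column stands in for the obstacle distance.
    col_depth = depth.min(axis=0)
    # Inverse-square falloff, zero beyond the influence radius d_max.
    mag = np.where(col_depth < d_max,
                   gain / np.maximum(col_depth, 1e-3) ** 2, 0.0)
    # Sum the per-column pushes (pointing away from each obstacle bearing).
    fx = -(mag * np.cos(angles)).sum() / w   # backward component
    fy = -(mag * np.sin(angles)).sum() / w   # lateral component
    return np.array([fx, fy])

def adjust_trajectory(waypoints, force, step=0.1):
    """Shift local waypoints along the repulsive force.

    Nearer waypoints are displaced more, so the path bends away from
    obstacles while still converging back toward the original goal.
    """
    waypoints = np.asarray(waypoints, dtype=float)
    weights = np.linspace(1.0, 0.0, len(waypoints))[:, None]
    return waypoints + step * weights * force
```

For example, a depth map with a close obstacle on the left half of the image yields a force with a positive lateral component, steering the adjusted trajectory to the right while leaving the final waypoint (and hence the goal) untouched.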