CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation

📅 2026-03-24
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of efficiently evaluating path risk according to a robot's physical constraints during zero-shot navigation in unstructured environments. The authors propose an embodied navigation framework that requires no task-specific training, leveraging a vision-language model (VLM) to generate an embodiment-aware cost map. To minimize online VLM queries, they introduce a visuosemantic caching mechanism that reuses historical risk assessments for semantically similar frames. A VLM-driven trajectory selection module further ensures safe path planning under behavioral constraints. Experiments on a quadrupedal robot show that the approach improves goal-reaching success by 10 percentage points, reduces behavioral violations by 33%, and cuts online VLM invocations by 85.7%.
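The caching idea described above can be illustrated with a minimal sketch: embed each incoming frame, reuse a cached VLM risk assessment when the embedding is close enough to a prior one, and only fall back to an online query on a cache miss. This is an illustrative assumption about how such a cache might look; the embedding model, similarity threshold, and the `query_vlm_costmap` callback are hypothetical, not the paper's actual API.

```python
import numpy as np

class VisuoSemanticCache:
    """Hypothetical frame-level cache keyed on visual-embedding similarity."""

    def __init__(self, similarity_threshold: float = 0.9):
        self.threshold = similarity_threshold
        self.entries = []  # list of (embedding, costmap) pairs

    def lookup(self, embedding: np.ndarray):
        """Return a cached costmap if a semantically similar frame exists."""
        for cached_emb, costmap in self.entries:
            sim = float(np.dot(embedding, cached_emb) /
                        (np.linalg.norm(embedding) * np.linalg.norm(cached_emb)))
            if sim >= self.threshold:
                return costmap  # cache hit: skip the online VLM query
        return None

    def insert(self, embedding: np.ndarray, costmap):
        self.entries.append((embedding, costmap))

def get_costmap(cache, embedding, query_vlm_costmap):
    """Use the cache when possible; fall back to an expensive online VLM call."""
    hit = cache.lookup(embedding)
    if hit is not None:
        return hit
    costmap = query_vlm_costmap()  # online VLM invocation (the costly path)
    cache.insert(embedding, costmap)
    return costmap
```

Under this sketch, the reported 85.7% reduction in online VLM queries would correspond to the fraction of frames whose embeddings land within the similarity threshold of a previously assessed scene.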

πŸ“ Abstract
Navigating unstructured environments requires assessing traversal risk relative to a robot's physical capabilities, a challenge that varies across embodiments. We present CATNAV, a cost-aware traversability navigation framework that leverages multimodal LLMs for zero-shot, embodiment-aware costmap generation without task-specific training. We introduce a visuosemantic caching mechanism that detects scene novelty and reuses prior risk assessments for semantically similar frames, reducing online VLM queries by 85.7%. Furthermore, we introduce a VLM-based trajectory selection module that evaluates proposals through visual reasoning to choose the safest path given behavioral constraints. We evaluate CATNAV on a quadruped robot across indoor and outdoor unstructured environments, comparing against state-of-the-art vision-language-action baselines. Across five navigation tasks, CATNAV achieves a 10-percentage-point higher average goal-reaching rate and 33% fewer behavioral constraint violations.
Problem

Research questions and friction points this paper is trying to address.

robot navigation
traversability assessment
zero-shot learning
embodiment-aware
unstructured environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot navigation
visuosemantic caching
embodiment-aware traversability
vision-language models
costmap generation
Aditya Potnis
Field Robotics Engineering and Science Hub (FRESH), Illinois Autonomous Farm, University of Illinois at Urbana-Champaign (UIUC), IL
Francisco Affonso
Field Robotics Engineering and Science Hub (FRESH), Illinois Autonomous Farm, University of Illinois at Urbana-Champaign (UIUC), IL
Shreya Gummadi
Field Robotics Engineering and Science Hub (FRESH), Illinois Autonomous Farm, University of Illinois at Urbana-Champaign (UIUC), IL
Naveen Kumar Uppalapati
CDA, NCSA, University of Illinois at Urbana-Champaign
Robotics · Autonomous Systems · Agricultural Robotics · Deep Learning
Girish Chowdhary
Associate Professor
Robotics · Agricultural Robotics · Adaptive Control · Mobile Robotics