WildOS: Open-Vocabulary Object Search in the Wild

📅 2026-02-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of semantic navigation in complex, unmapped outdoor environments with limited depth perception, where conventional geometric frontier-based exploration struggles to balance efficiency and safety. The authors propose WildOS, a novel system that unifies open-vocabulary object search and geometrically safe exploration within a sparse navigation graph framework. Leveraging a vision foundation model, ExploRFM, WildOS jointly predicts traversability, visual frontiers, and semantic similarity to target objects, while incorporating a particle-filter-based coarse localization mechanism to enable long-range semantic navigation beyond sensor range. Experimental results demonstrate that WildOS significantly outperforms purely geometric or purely visual baselines across diverse natural and urban scenes, achieving notable advances in navigation efficiency and autonomy, and validating the efficacy of vision foundation models in enabling open-world robotic behaviors.
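The summary above describes scoring frontier nodes of a sparse navigation graph by fusing predicted traversability and semantic similarity to the target query. The paper does not spell out the fusion rule here, so the following is a minimal illustrative sketch under assumed names and a hypothetical weighted-sum scoring scheme, not the actual WildOS formulation:

```python
import math
from dataclasses import dataclass

@dataclass
class FrontierNode:
    """A frontier node of the sparse navigation graph (illustrative)."""
    node_id: int
    position: tuple             # (x, y) in the robot frame
    traversability: float       # [0, 1], predicted safety of reaching the node
    semantic_similarity: float  # [0, 1], similarity to the open-vocabulary query

def score_frontier(node: FrontierNode, robot_pos: tuple,
                   w_sem: float = 0.6, w_trav: float = 0.3,
                   w_dist: float = 0.1) -> float:
    """Combine semantic, geometric, and distance terms into one score.

    The weights here are hypothetical stand-ins for however WildOS
    actually fuses the ExploRFM predictions.
    """
    dist = math.hypot(node.position[0] - robot_pos[0],
                      node.position[1] - robot_pos[1])
    distance_penalty = 1.0 / (1.0 + dist)  # prefer nearby frontiers, all else equal
    return (w_sem * node.semantic_similarity
            + w_trav * node.traversability
            + w_dist * distance_penalty)

def best_frontier(nodes, robot_pos):
    """Pick the highest-scoring frontier to head toward next."""
    return max(nodes, key=lambda n: score_frontier(n, robot_pos))
```

With such a scheme, a distant frontier with high semantic similarity can outscore a nearby but semantically uninteresting one, which matches the behavior the summary attributes to the vision-scored graph.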

📝 Abstract
Autonomous navigation in complex, unstructured outdoor environments requires robots to operate over long ranges without prior maps and with limited depth sensing. In such settings, relying solely on geometric frontiers for exploration is often insufficient: the ability to reason semantically about where to go and what is safe to traverse is crucial for robust, efficient exploration. This work presents WildOS, a unified system for long-range, open-vocabulary object search that combines safe geometric exploration with semantic visual reasoning. WildOS builds a sparse navigation graph to maintain spatial memory, while utilizing a foundation-model-based vision module, ExploRFM, to score frontier nodes of the graph. ExploRFM simultaneously predicts traversability, visual frontiers, and object similarity in image space, enabling real-time, onboard semantic navigation. The resulting vision-scored graph enables the robot to explore semantically meaningful directions while ensuring geometric safety. Furthermore, we introduce a particle-filter-based method for coarse localization of the open-vocabulary target query that estimates candidate goal positions beyond the robot's immediate depth horizon, enabling effective planning toward distant goals. Extensive closed-loop field experiments across diverse off-road and urban terrains demonstrate that WildOS enables robust navigation, significantly outperforming purely geometric and purely vision-based baselines in both efficiency and autonomy. Our results highlight the potential of vision foundation models to drive open-world robotic behaviors that are both semantically informed and geometrically grounded. Project Page: https://leggedrobotics.github.io/wildos/
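The abstract's particle-filter-based coarse localization estimates candidate goal positions beyond the depth horizon. The paper's exact observation model is not given here, so the sketch below is an assumed minimal variant: particles are candidate goal positions, and each high-similarity detection along a viewing bearing reweights particles toward that ray before resampling.

```python
import random
import math

class CoarseGoalParticleFilter:
    """Illustrative sketch of particle-filter coarse localization of an
    open-vocabulary target; the update model is an assumption, not the
    paper's actual formulation."""

    def __init__(self, n_particles=500, world_size=100.0):
        # Spread candidate goal positions uniformly over the (unmapped) area.
        self.particles = [(random.uniform(0, world_size),
                           random.uniform(0, world_size))
                          for _ in range(n_particles)]

    def update(self, robot_pos, bearing, similarity, sigma=0.3):
        """Reweight particles by agreement with a semantic detection:
        a high-similarity observation along `bearing` (radians) favours
        particles near that viewing ray."""
        weights = []
        for (px, py) in self.particles:
            angle = math.atan2(py - robot_pos[1], px - robot_pos[0])
            # Wrapped angular difference between particle bearing and detection.
            diff = math.atan2(math.sin(angle - bearing),
                              math.cos(angle - bearing))
            agreement = math.exp(-0.5 * (diff / sigma) ** 2)
            # Strong detections pull mass onto the ray; weak ones barely move it.
            weights.append(1e-6 + similarity * agreement + (1 - similarity) * 0.1)
        total = sum(weights)
        probs = [w / total for w in weights]
        # Multinomial resampling keeps the particle count fixed.
        self.particles = random.choices(self.particles, weights=probs,
                                        k=len(self.particles))

    def estimate(self):
        """Mean of the particle set as a coarse goal position."""
        n = len(self.particles)
        return (sum(p[0] for p in self.particles) / n,
                sum(p[1] for p in self.particles) / n)
```

A planner could feed `estimate()` to the navigation graph as a distant goal hypothesis, refining it as further detections arrive; this mirrors, in spirit, how coarse localization lets the system plan beyond its immediate depth horizon.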
Problem

Research questions and friction points this paper is trying to address.

open-vocabulary object search
autonomous navigation
semantic exploration
geometric safety
vision foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

open-vocabulary object search
foundation models
semantic navigation
geometric exploration
particle-filter localization