🤖 AI Summary
Zero-shot outdoor long-range navigation faces significant challenges, including extremely small target size, severe occlusion, and intermittent visibility. This paper proposes a lightweight closed-loop navigation framework that constructs a hierarchical multi-scale image patch structure, integrating target semantics with visual saliency to enable robust directional estimation and visibility awareness for sub-pixel-sized targets at distances exceeding 100 meters. A hierarchical saliency fusion mechanism is introduced, combining keyframe memory with saliency-weighted integration of historical headings to support active target search and heading maintenance under occlusion, without requiring full-image downscaling. Evaluated in both simulation and real-world outdoor environments, the system stably detects semantic targets beyond 150 meters, achieves 82.6% heading accuracy under dynamic visibility conditions, and improves task success rate by 17.5% over state-of-the-art methods.
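To make the aggregation step concrete, here is a minimal sketch, not the paper's implementation, of how per-patch semantic saliency scores could be pooled bottom-up through an aligned multi-scale patch hierarchy into a coarse regional map that yields a heading offset and a visibility flag. The 2x2 max-pooling rule, grid size, horizontal field of view, and visibility threshold below are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of bottom-up saliency aggregation over an
# aligned multi-scale patch hierarchy. Assumed input: a fine-level grid of per-patch
# target-semantic similarity scores (e.g. from an open-vocabulary matcher); pooling
# rule, level count, and thresholds are illustrative.
import numpy as np

def aggregate_saliency(fine_scores: np.ndarray, levels: int = 3) -> list[np.ndarray]:
    """Pool a fine HxW saliency grid up the hierarchy via 2x2 max-pooling per level."""
    pyramid = [fine_scores]
    for _ in range(levels - 1):
        s = pyramid[-1]
        h, w = s.shape[0] // 2, s.shape[1] // 2
        s = s[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))  # keep strongest child
        pyramid.append(s)
    return pyramid  # pyramid[-1] is the coarse regional saliency map

def target_direction(coarse: np.ndarray, hfov_deg: float = 90.0,
                     vis_thresh: float = 0.5) -> tuple[bool, float]:
    """Return (visible, heading offset in degrees) from the coarse saliency layer."""
    j = int(np.argmax(coarse.max(axis=0)))      # most salient column
    visible = coarse.max() > vis_thresh         # visibility gate (assumed threshold)
    offset = (j + 0.5) / coarse.shape[1] - 0.5  # normalized horizontal position
    return visible, offset * hfov_deg

# Toy usage: a weak peak to the right of center at the fine level.
fine = np.zeros((16, 16)); fine[7, 11] = 0.8
vis, heading = target_direction(aggregate_saliency(fine)[-1])
print(vis, round(heading, 1))  # True, positive offset -> steer right
```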
📝 Abstract
Zero-shot object navigation (ZSON) in large-scale outdoor environments faces many challenges; we specifically address a coupled pair: long-range targets that reduce to tiny image projections, and intermittent visibility due to partial or complete occlusion. We present a unified, lightweight closed-loop system built on an aligned multi-scale image tile hierarchy. Through hierarchical target-saliency fusion, it summarizes localized semantic contrast into a stable coarse-layer regional saliency that provides the target direction and indicates target visibility. This regional saliency supports visibility-aware heading maintenance through keyframe memory, saliency-weighted fusion of historical headings, and active search during temporary invisibility. The system avoids whole-image rescaling, enables deterministic bottom-up aggregation, supports zero-shot navigation, and runs efficiently on a mobile robot. Across simulation and real-world outdoor trials, the system detects semantic targets beyond 150 m, maintains a correct heading through visibility changes with 82.6% probability, and improves overall task success by 17.5% compared with state-of-the-art methods, demonstrating robust ZSON toward distant and intermittently observable targets.
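As a companion sketch, and again an assumption rather than the paper's code, the visibility-aware heading maintenance can be pictured as a small keyframe memory that is followed directly while the target is visible, fused by saliency and recency during occlusion, and replaced by a slow rotational sweep when no memory exists yet. The window size, decay factor, and search rate are illustrative parameters.

```python
# Minimal sketch of visibility-aware heading maintenance with keyframe memory and
# saliency-weighted fusion of historical headings (assumed design, not the paper's).
from collections import deque

class HeadingMemory:
    def __init__(self, window: int = 10, decay: float = 0.9, search_rate: float = 5.0):
        self.keyframes = deque(maxlen=window)  # (heading_deg, saliency) pairs
        self.decay = decay                     # older keyframes count less
        self.search_rate = search_rate         # deg per step during active search

    def command(self, visible: bool, target_heading_deg: float, saliency: float,
                robot_heading_deg: float) -> float:
        """Return a goal heading (deg, world frame) to steer toward this step."""
        if visible:
            self.keyframes.append((target_heading_deg, saliency))
            return target_heading_deg                # follow the current observation
        if self.keyframes:                           # occluded: fuse remembered headings
            num = den = 0.0
            for age, (h, s) in enumerate(reversed(self.keyframes)):
                w = s * (self.decay ** age)          # saliency x recency weighting
                num += w * h
                den += w
            return num / den
        return robot_heading_deg + self.search_rate  # no memory yet: sweep to search

# Toy usage: two sightings, then occlusion.
mem = HeadingMemory()
mem.command(True, 12.0, 0.9, robot_heading_deg=0.0)
mem.command(True, 15.0, 0.6, robot_heading_deg=5.0)
print(round(mem.command(False, 0.0, 0.0, robot_heading_deg=10.0), 1))  # blend of 12 and 15
```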