🤖 AI Summary
Existing open-set semantic mapping methods are constrained by sensing depth: they struggle to jointly model within-range voxel and beyond-range ray observations while balancing fine-grained semantic accuracy against computational efficiency. This paper proposes RayFronts, a cross-range semantic mapping framework for open-world robots: a unified representation that, for the first time, jointly encodes open-set semantics on both in-range voxels and beyond-range ray fronts anchored at the map boundary. The authors also design a planner-agnostic online evaluation paradigm that decouples mapping performance from the rest of the system, and develop a real-time architecture that integrates the sparse ray-front representation with zero-shot 3D semantic segmentation. Experiments demonstrate: (1) real-time operation at 8.84 Hz on an NVIDIA Orin AGX; (2) 1.34× better zero-shot 3D semantic segmentation with 16.5× higher throughput; and (3) search-volume reduction 2.2× more efficient than the closest online baselines.
📝 Abstract
Open-set semantic mapping is crucial for open-world robots. Current mapping approaches are either limited by their depth range or map beyond-range entities only in constrained settings; overall, they fail to combine within-range and beyond-range observations. Furthermore, these methods trade off fine-grained semantics against efficiency. We introduce RayFronts, a unified representation that enables both dense within-range and efficient beyond-range semantic mapping. RayFronts attaches task-agnostic open-set semantics to both in-range voxels and beyond-range rays encoded at map boundaries, empowering the robot to significantly reduce search volumes and make informed decisions both within and beyond sensory range, while running at 8.84 Hz on an Orin AGX. Benchmarking the within-range semantics shows that RayFronts's fine-grained image encoding provides 1.34x better zero-shot 3D semantic segmentation performance while improving throughput by 16.5x. Traditionally, online mapping performance is entangled with other system components, complicating evaluation. We propose a planner-agnostic evaluation framework that captures utility for online beyond-range search and exploration, and show that RayFronts reduces search volume 2.2x more efficiently than the closest online baselines.
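To make the representation concrete, the following is a minimal, illustrative sketch of the core idea described above: observations within the depth range are fused into semantic voxels, while observations beyond the range are stored as rays anchored where they exit the sensing boundary, so an open-set query can return both dense in-range hits and far-field search directions. All class and method names are hypothetical, and plain feature vectors stand in for the open-set (e.g. CLIP-style) embeddings; this is not the paper's implementation.

```python
import numpy as np

class JointVoxelRayMap:
    """Illustrative joint map: in-range voxels + beyond-range ray fronts."""

    def __init__(self, voxel_size=0.5, max_range=10.0):
        self.voxel_size = voxel_size
        self.max_range = max_range      # sensing range; farther hits become rays
        self.voxels = {}                # (i, j, k) -> running-mean feature
        self.counts = {}                # (i, j, k) -> number of fused observations
        self.ray_fronts = []            # (anchor_point, unit_direction, feature)

    def insert(self, origin, point, feature):
        """Fuse one observation: a 3D point seen from `origin` with a feature."""
        vec = point - origin
        dist = np.linalg.norm(vec)
        if dist <= self.max_range:
            # Within range: fuse into a voxel with a running mean of features.
            key = tuple((point // self.voxel_size).astype(int))
            n = self.counts.get(key, 0)
            old = self.voxels.get(key, np.zeros_like(feature))
            self.voxels[key] = (old * n + feature) / (n + 1)
            self.counts[key] = n + 1
        else:
            # Beyond range: keep only the bearing, anchored at the map boundary.
            direction = vec / dist
            anchor = origin + direction * self.max_range
            self.ray_fronts.append((anchor, direction, feature))

    def query(self, query_feature, thresh=0.8):
        """Return voxel keys and ray fronts whose features match the query."""
        def cos_sim(f):
            denom = np.linalg.norm(f) * np.linalg.norm(query_feature) + 1e-9
            return float(f @ query_feature) / denom
        voxel_hits = [k for k, f in self.voxels.items() if cos_sim(f) >= thresh]
        ray_hits = [(a, d) for a, d, f in self.ray_fronts if cos_sim(f) >= thresh]
        return voxel_hits, ray_hits
```

A query for a semantic target then yields in-range voxel locations directly, while matching ray fronts restrict far-field search to narrow cones instead of the whole unexplored map, which is the mechanism behind the reported search-volume reduction.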