🤖 AI Summary
This work addresses the limited generalizability of existing 3D occupancy prediction methods, which rely heavily on precise sensor calibration and in-domain annotations, rendering them ill-suited for unconstrained urban environments. To overcome this, we propose OccAny, the first universal 3D occupancy prediction model that operates without requiring sensor calibration and supports diverse multi-view inputs, including monocular, sequential, and surround-view configurations. Our key innovations include a unified 3D occupancy framework, a Segmentation Forcing mechanism that enhances occupancy quality and enables mask-level semantic prediction, and a test-time geometric completion strategy based on novel view synthesis. Extensive experiments demonstrate that OccAny consistently outperforms current visual-geometry baselines across all three input settings on two major urban scene datasets, achieving performance on par with in-domain self-supervised approaches.
📝 Abstract
Relying on in-domain annotations and precise sensor-rig priors, existing 3D occupancy prediction methods are limited in both scalability and out-of-domain generalization. While recent visual geometry foundation models exhibit strong generalization capabilities, they are designed for general-purpose use and lack one or more key ingredients required for urban occupancy prediction, namely metric prediction, geometry completion in cluttered scenes, and adaptation to urban scenarios. We address this gap and present OccAny, the first unconstrained urban 3D occupancy model capable of operating on out-of-domain uncalibrated scenes to predict and complete metric occupancy coupled with segmentation features. OccAny is versatile and can predict occupancy from sequential, monocular, or surround-view images. Our contributions are three-fold: (i) we propose the first generalized 3D occupancy framework with (ii) Segmentation Forcing, which improves occupancy quality while enabling mask-level prediction, and (iii) a Novel View Rendering pipeline that infers novel-view geometry to enable test-time view augmentation for geometry completion. Extensive experiments demonstrate that OccAny outperforms all visual geometry baselines on the 3D occupancy prediction task, while remaining competitive with in-domain self-supervised methods across three input settings on two established urban occupancy prediction datasets. Our code is available at https://github.com/valeoai/OccAny.