🤖 AI Summary
Current approaches to panoptic occupancy prediction for autonomous driving are hindered by the lack of high-quality 3D mesh assets, instance-level annotations, and physically consistent occupancy data. To address these limitations, this work introduces an instance-centric benchmark for 3D panoptic occupancy prediction. It comprises ADMesh, described as the first unified 3D mesh library tailored for autonomous driving (over 15K high-quality models), and CarlaOcc, a large-scale synthetic dataset generated with the CARLA simulator that contains more than 100K frames with voxel resolutions as fine as 0.05 m. The benchmark provides unified mesh resources together with fine-grained, voxel-wise instance annotations. It also introduces standardized metrics for quantifying the quality of existing occupancy datasets and benchmarks representative models on the new dataset, giving a unified platform for fair comparison and reproducible research in accurate geometric reconstruction and holistic 3D scene understanding.
📝 Abstract
Panoptic occupancy prediction aims to jointly infer voxel-wise semantics and instance identities within a unified 3D scene representation. Nevertheless, progress in this field remains constrained by the absence of high-quality 3D mesh resources, instance-level annotations, and physically consistent occupancy datasets. Existing benchmarks typically provide incomplete and low-resolution geometry without instance-level annotations, limiting the development of models capable of achieving precise geometric reconstruction, reliable occlusion reasoning, and holistic 3D understanding. To address these challenges, this paper presents an instance-centric benchmark for the 3D panoptic occupancy prediction task. Specifically, we introduce ADMesh, the first unified 3D mesh library tailored for autonomous driving, which integrates over 15K high-quality 3D models with diverse textures and rich semantic annotations. Building upon ADMesh, we further construct CarlaOcc, a large-scale, physically consistent panoptic occupancy dataset generated using the CARLA simulator. This dataset contains over 100K frames with fine-grained, instance-level occupancy ground truth at voxel resolutions as fine as 0.05 m. Furthermore, standardized evaluation metrics are introduced to quantify the quality of existing occupancy datasets. Finally, a systematic benchmark of representative models is established on the proposed dataset, which provides a unified platform for fair comparison and reproducible research in the field of 3D panoptic perception. Code and dataset are available at https://mias.group/CarlaOcc.
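To make the task definition concrete, the following is a minimal sketch of what a panoptic occupancy representation might look like: each occupied voxel carries both a semantic class and an instance identity. The sparse dictionary layout, the function names `voxel_index` and `build_panoptic_grid`, and the integer label convention are assumptions chosen for illustration; they are not the CarlaOcc data format. Only the 0.05 m voxel size is taken from the abstract.

```python
import math
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]
Label = Tuple[int, int]  # (semantic class id, instance id), hypothetical convention

VOXEL_SIZE_M = 0.05  # finest resolution quoted in the abstract


def voxel_index(point: Point,
                voxel_size: float = VOXEL_SIZE_M,
                origin: Point = (0.0, 0.0, 0.0)) -> Voxel:
    """Map a metric 3D point to the integer index of the voxel containing it."""
    return tuple(math.floor((p - o) / voxel_size) for p, o in zip(point, origin))


def build_panoptic_grid(
    labeled_points: Iterable[Tuple[Point, int, int]]
) -> Dict[Voxel, Label]:
    """Build a sparse panoptic grid: occupied voxel -> (semantic id, instance id).

    If several points fall into the same voxel, the last one wins. A real
    pipeline would need a principled tie-breaking rule; this is a simplification.
    """
    grid: Dict[Voxel, Label] = {}
    for xyz, semantic_id, instance_id in labeled_points:
        grid[voxel_index(xyz)] = (semantic_id, instance_id)
    return grid


if __name__ == "__main__":
    # Two surface samples of one object (instance 7) and one of another (instance 8),
    # both with the same hypothetical semantic class id 1.
    samples = [
        ((0.12, 0.03, 0.26), 1, 7),
        ((0.13, 0.04, 0.27), 1, 7),
        ((1.02, 0.03, 0.26), 1, 8),
    ]
    for voxel, (semantic_id, instance_id) in sorted(build_panoptic_grid(samples).items()):
        print(voxel, "semantic:", semantic_id, "instance:", instance_id)
```

The point of the sketch is that semantics alone cannot distinguish the two objects, since they share a class id; the instance id stored per voxel is what separates them, which is the annotation the abstract says existing benchmarks lack.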