🤖 AI Summary
Existing 3D scene understanding models reach high accuracy, but their uncertainty estimates are often poorly calibrated, which limits deployment in safety-critical applications. To address this, the authors introduce Calib3D, a benchmark that systematically evaluates the uncertainty calibration of 28 state-of-the-art models across 10 diverse 3D perception datasets. They propose DeptS, a depth-aware scaling method that adjusts confidence scores using depth information. They also run calibration diagnostics along several dimensions (network capacity, LiDAR representation, rasterization resolution, and 3D data augmentation) to identify the factors that most influence calibration quality. Experiments show that DeptS reduces calibration error, notably Expected Calibration Error (ECE). A unified evaluation toolkit is publicly released to support reproducible and reliable 3D perception research.
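For readers unfamiliar with the metric, ECE can be sketched as follows. This is a minimal illustration of the standard equal-width-bin definition, not code from the released toolkit; the function name `expected_calibration_error`, the default of 10 bins, and the toy inputs are assumptions made for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted top-class probabilities in [0, 1]
    correct:     booleans, True where the prediction matched the label
    n_bins:      number of equal-width confidence bins (assumed default: 10)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # summed confidence per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); c == 1.0 goes to the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        hit_sum[b] += 1.0 if ok else 0.0
        count[b] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


# Toy example: four predictions at 0.9 confidence, three of them correct.
# The model is overconfident by 0.9 - 0.75 = 0.15.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A perfectly calibrated model would score 0; larger values mean the reported confidence drifts further from the observed accuracy.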
📝 Abstract
Safety-critical 3D scene understanding tasks require predictions from 3D perception models that are not only accurate but also confident. This study introduces Calib3D, a pioneering effort to benchmark and scrutinize the reliability of 3D scene understanding models from an uncertainty estimation viewpoint. We comprehensively evaluate 28 state-of-the-art models across 10 diverse 3D datasets, uncovering insightful phenomena related to both the aleatoric and epistemic uncertainties in 3D scene understanding. We find that, despite achieving impressive accuracy, existing models frequently fail to provide reliable uncertainty estimates, a pitfall that critically undermines their applicability in safety-sensitive contexts. Through extensive analysis of key factors such as network capacity, LiDAR representations, rasterization resolutions, and 3D data augmentation techniques, we relate these factors directly to model calibration quality. Furthermore, we introduce DeptS, a novel depth-aware scaling approach aimed at enhancing 3D model calibration. Extensive experiments across a wide range of configurations validate the superiority of our method. We hope this work can serve as a cornerstone for fostering reliable 3D scene understanding. Code and benchmark toolkit are publicly available.
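The abstract does not spell out how DeptS works, so the sketch below only illustrates the general idea of depth-aware scaling and should not be read as the authors' method. It applies ordinary temperature scaling with a temperature that grows with a point's depth. The linear schedule, the parameter names `base_temperature` and `slope_per_m`, and their values are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def depth_aware_temperature(depth_m, base_temperature=1.0, slope_per_m=0.01):
    """Hypothetical linear schedule: farther points get a higher temperature.

    This functional form is an assumption for illustration only.
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    return base_temperature + slope_per_m * depth_m


def calibrated_probs(logits, depth_m):
    """Soften per-point class probabilities according to the point's depth."""
    return softmax(logits, depth_aware_temperature(depth_m))


# Same logits at two depths: the far point keeps its predicted class
# but reports a lower confidence.
logits = [2.0, 0.5, 0.1]
near_probs = calibrated_probs(logits, depth_m=0.0)
far_probs = calibrated_probs(logits, depth_m=50.0)
```

Because dividing logits by a positive temperature never changes their ordering, this kind of scaling leaves the predicted class, and therefore accuracy, untouched while adjusting only the confidence.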