Unleashing Semantic and Geometric Priors for 3D Scene Completion

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing camera-based 3D semantic scene completion (SSC) methods use a single coupled encoder to supply both semantic and geometric priors, so the two interfere with each other and performance suffers. FoundationSSC addresses this with dual decoupling: at the source level, a foundation encoder supplies semantic feature priors and stereo cost volumes separately; at the pathway level, each prior is refined in its own pathway. A hybrid view transformation turns the refined inputs into complementary 3D features, and an Axis-Aware Fusion (AAF) module merges them anisotropically into a unified representation. On SemanticKITTI, FoundationSSC improves on the previous best results by +0.23 mIoU and +2.03 IoU; on SSCBench-KITTI-360, it reaches a state-of-the-art 21.78 mIoU and 48.61 IoU.

📝 Abstract
Camera-based 3D semantic scene completion (SSC) provides dense geometric and semantic perception for autonomous driving and robotic navigation. However, existing methods rely on a coupled encoder to deliver both semantic and geometric priors, which forces the model to make a trade-off between conflicting demands and limits its overall performance. To tackle these challenges, we propose FoundationSSC, a novel framework that performs dual decoupling at both the source and pathway levels. At the source level, we introduce a foundation encoder that provides rich semantic feature priors for the semantic branch and high-fidelity stereo cost volumes for the geometric branch. At the pathway level, these priors are refined through specialised, decoupled pathways, yielding superior semantic context and depth distributions. Our dual-decoupling design produces disentangled and refined inputs, which are then utilised by a hybrid view transformation to generate complementary 3D features. Additionally, we introduce a novel Axis-Aware Fusion (AAF) module that addresses the often-overlooked challenge of fusing these features by anisotropically merging them into a unified representation. Extensive experiments demonstrate the advantages of FoundationSSC, achieving simultaneous improvements in both semantic and geometric metrics, surpassing prior bests by +0.23 mIoU and +2.03 IoU on SemanticKITTI. Additionally, we achieve state-of-the-art performance on SSCBench-KITTI-360, with 21.78 mIoU and 48.61 IoU. The code will be released upon acceptance.
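The abstract says the geometric branch turns stereo cost volumes into depth distributions but does not say how. A common way to do this, assumed here purely for illustration, is a softmax over the negated matching costs of each pixel. The function names, depth bins, and cost values below are hypothetical and are not taken from the paper.

```python
import math

def depth_distribution(costs):
    """Turn per-bin matching costs for one pixel into a probability distribution.

    Assumption: lower cost means a better stereo match, so we softmax the
    negated costs. This is a generic technique, not the paper's stated method.
    """
    best = min(costs)  # shift by the best cost for numerical stability
    weights = [math.exp(-(c - best)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def expected_depth(probs, depth_bins):
    """Probability-weighted average of the depth-bin centres."""
    return sum(p * d for p, d in zip(probs, depth_bins))

# Hypothetical pixel: four depth bins (metres) and their matching costs.
depth_bins = [2.0, 4.0, 8.0, 16.0]
costs = [5.0, 1.0, 3.0, 6.0]
probs = depth_distribution(costs)
print([round(p, 3) for p in probs], round(expected_depth(probs, depth_bins), 2))
```

The bin with the lowest cost receives the most probability mass, while the remaining bins keep some weight, which is what lets a view transformation spread image features along the camera ray instead of committing to a single depth.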
Problem

Research questions and friction points this paper is trying to address.

Decoupling semantic and geometric priors for 3D scene completion
Addressing feature fusion challenges in autonomous driving perception
Improving simultaneous semantic and geometric metric performance
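
The two metrics reported for this task measure different things: IoU scores geometry (occupied versus empty voxels, ignoring class), while mIoU averages the per-class IoU over the semantic classes. The sketch below computes both on flat lists of voxel labels. It is a simplified illustration, not a benchmark's official evaluator: the label conventions (`empty=0`, `ignore=255`) are assumptions, and it averages only over classes that appear in the sample, whereas benchmark scripts typically accumulate counts over the whole dataset.

```python
def ssc_metrics(pred, gt, num_classes, empty=0, ignore=255):
    """Return (occupancy IoU, semantic mIoU) for flat lists of voxel labels."""
    # Drop voxels whose ground truth is unknown.
    pairs = [(p, g) for p, g in zip(pred, gt) if g != ignore]

    # Geometric IoU: every non-empty label counts as "occupied".
    inter = sum(1 for p, g in pairs if p != empty and g != empty)
    union = sum(1 for p, g in pairs if p != empty or g != empty)
    iou = inter / union if union else 0.0

    # Semantic mIoU: mean per-class IoU over non-empty classes that occur.
    class_ious = []
    for c in range(num_classes):
        if c == empty:
            continue
        i = sum(1 for p, g in pairs if p == c and g == c)
        u = sum(1 for p, g in pairs if p == c or g == c)
        if u:
            class_ious.append(i / u)
    miou = sum(class_ious) / len(class_ious) if class_ious else 0.0
    return iou, miou

# Toy scene with six voxels and two semantic classes (1 and 2).
pred = [0, 1, 1, 2, 2, 0]
gt = [0, 1, 2, 2, 255, 1]
iou, miou = ssc_metrics(pred, gt, num_classes=3)
```

In this toy scene the occupancy IoU is higher than the mIoU because a voxel predicted as class 1 but labelled class 2 still counts as correctly occupied. A method can therefore gain on one metric without gaining on the other, which is why improving both at once is listed as a goal.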
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual decoupling at source and pathway levels
Hybrid view transformation for complementary 3D features
Axis-Aware Fusion module for anisotropic feature merging
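
The source does not describe the internals of the Axis-Aware Fusion module, so the following is only a toy illustration of what "anisotropic" merging could mean: the blend between two 3D feature grids is allowed to vary differently along each axis. The per-axis gates, their averaging, and the function name are all hypothetical and should not be read as the authors' design.

```python
def fuse_anisotropic(vol_a, vol_b, gate_x, gate_y, gate_z):
    """Blend two X*Y*Z voxel grids with a weight built from per-axis gates.

    Each gate is a list of values in [0, 1], one per index along its axis.
    The weight for voxel (i, j, k) is the mean of the three gate values, so
    the balance between vol_a and vol_b can change differently per axis.
    (Hypothetical scheme for illustration only.)
    """
    fused = []
    for i, gx in enumerate(gate_x):
        plane = []
        for j, gy in enumerate(gate_y):
            row = []
            for k, gz in enumerate(gate_z):
                w = (gx + gy + gz) / 3.0
                row.append(w * vol_a[i][j][k] + (1.0 - w) * vol_b[i][j][k])
            plane.append(row)
        fused.append(plane)
    return fused

# Toy 2x2x2 grids: vol_a is all ones and vol_b all zeros,
# so each fused value equals the blending weight at that voxel.
vol_a = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
vol_b = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
fused = fuse_anisotropic(vol_a, vol_b,
                         gate_x=[0.9, 0.3], gate_y=[0.6, 0.6], gate_z=[0.0, 0.6])
```

An isotropic fusion would use one scalar weight for the whole grid. Here the weight shifts along X and Z but stays constant along Y, which is the kind of direction-dependent behaviour the term "anisotropic" suggests.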