🤖 AI Summary
Existing camera-based 3D semantic scene completion (SSC) methods struggle to reconstruct out-of-frame regions, such as the blind zones at the sides of the ego-vehicle, because their temporal fusion mainly enhances in-frame regions and fails to exploit the contextual information that historical frames hold about these areas. To address this, we propose C3DFusion, a current-centric spatiotemporal fusion module that explicitly aligns 3D-lifted point features from the current and historical frames. It combines two techniques: historical context blurring, which attenuates the scale of historical point features to suppress noise from inaccurate warping, and current-centric feature densification, which increases the volumetric contribution of current point features. Evaluated on SemanticKITTI and SSCBench-KITTI-360, the method achieves significant improvements over state-of-the-art approaches, particularly in semantic and geometric reconstruction accuracy within lateral blind regions. C3DFusion also yields notable gains when applied to other baseline models, indicating that it generalizes across architectures.
📝 Abstract
Recent camera-based 3D semantic scene completion (SSC) methods increasingly leverage temporal cues to enrich the features of the current frame. However, these approaches focus primarily on enhancing in-frame regions and often struggle to reconstruct critical out-of-frame areas near the sides of the ego-vehicle, even though previous frames commonly contain valuable contextual information about these unseen regions. To address this limitation, we propose the Current-Centric Contextual 3D Fusion (C3DFusion) module, which generates hidden-region-aware 3D feature geometry by explicitly aligning 3D-lifted point features from both current and historical frames. C3DFusion performs enhanced temporal fusion through two complementary techniques: historical context blurring, which suppresses noise from inaccurately warped historical point features by attenuating their scale, and current-centric feature densification, which enhances current point features by increasing their volumetric contribution. When integrated into standard SSC architectures, C3DFusion significantly outperforms state-of-the-art methods on the SemanticKITTI and SSCBench-KITTI-360 datasets. It also generalizes well, achieving notable performance gains when applied to other baseline models.
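The fusion idea in the abstract can be made concrete with a toy sketch. This is not the paper's implementation: the abstract does not specify how blurring or densification are computed, so the code below only illustrates the general pattern of warping historical points into the current frame, down-weighting them, and letting current points cover a larger voxel neighborhood. The function names and the parameters `hist_scale`, `densify_radius`, and `voxel_size` are hypothetical, and features are reduced to scalars for brevity.

```python
from collections import defaultdict


def transform(points, pose):
    """Apply a 4x4 rigid transform (nested lists, row-major) to 3D points."""
    return [
        tuple(
            pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
            for r in range(3)
        )
        for x, y, z in points
    ]


def voxel_index(point, voxel_size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(int(c // voxel_size) for c in point)


def fuse(cur_pts, cur_feats, hist_pts, hist_feats, hist_to_cur,
         voxel_size=1.0, hist_scale=0.5, densify_radius=1):
    """Accumulate current and historical point features into one sparse grid.

    Assumptions (illustrative only, not taken from the paper):
    - current features are copied into a cubic neighborhood of voxels,
      a stand-in for "increasing their volumetric contribution";
    - historical features are multiplied by hist_scale < 1 after warping,
      a stand-in for "attenuating their scale".
    """
    grid = defaultdict(float)
    r = densify_radius

    # Current frame: spread each feature over (2r + 1)^3 voxels.
    for p, f in zip(cur_pts, cur_feats):
        i, j, k = voxel_index(p, voxel_size)
        for di in range(-r, r + 1):
            for dj in range(-r, r + 1):
                for dk in range(-r, r + 1):
                    grid[(i + di, j + dj, k + dk)] += f

    # Historical frame: warp into the current frame, then add a damped copy
    # to the single voxel that the warped point falls into.
    warped = transform(hist_pts, hist_to_cur)
    for p, f in zip(warped, hist_feats):
        grid[voxel_index(p, voxel_size)] += hist_scale * f

    return dict(grid)


# Example: one current point, and one historical point that lands 5 m away
# along x after warping, outside the neighborhood of the current point.
shift_x = [[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
fused = fuse([(0.5, 0.5, 0.5)], [1.0], [(0.5, 0.5, 0.5)], [2.0], shift_x)
```

In this toy setup the historical point fills a voxel that the current frame never observed, which mirrors the motivation of recovering out-of-frame regions from earlier frames, while the damping factor limits how much a mis-warped historical feature can contribute.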