SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion

📅 2025-03-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address incomplete semantic scene completion (SSC) caused by occlusions in ground-level camera views, this paper proposes SGFormer, which the authors present as the first satellite-ground cooperative SSC framework. It introduces a dual-branch Transformer encoder that processes satellite and ground views in parallel and fuses paired satellite-ground images. A ground-view-guided rectification mechanism corrects satellite feature biases during encoding, and an adaptive cross-view weighting strategy balances the contributions of the two views. Experiments on SemanticKITTI and SSCBench-KITTI-360 show that SGFormer outperforms state-of-the-art methods. Code is publicly available.

📝 Abstract
Recently, camera-based solutions have been extensively explored for semantic scene completion (SSC). Despite their success in visible areas, existing methods struggle to capture complete scene semantics due to frequent visual occlusions. To address this limitation, this paper presents SGFormer, the first satellite-ground cooperative SSC framework, exploring the potential of satellite-ground image pairs in the SSC task. Specifically, we propose a dual-branch architecture that encodes orthogonal satellite and ground views in parallel, unifying them into a common domain. Additionally, we design a ground-view guidance strategy that corrects satellite image biases during feature encoding, addressing misalignment between satellite and ground views. Moreover, we develop an adaptive weighting strategy that balances contributions from satellite and ground views. Experiments demonstrate that SGFormer outperforms the state of the art on the SemanticKITTI and SSCBench-KITTI-360 datasets. Our code is available at https://github.com/gxytcrc/SGFormer.
Problem

Research questions and friction points this paper is trying to address.

Addresses incomplete scene semantics due to visual occlusions
Explores whether satellite-ground image pairs can improve 3D semantic scene completion
Resolves misalignment between satellite and ground views
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch architecture for satellite-ground fusion
Ground-view guidance corrects satellite biases
Adaptive weighting balances view contributions
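
The adaptive weighting idea listed above can be illustrated with a small sketch. This is not the authors' implementation: the function name `adaptive_fuse`, the sigmoid gate, and the linear scoring weights are all assumptions chosen for illustration, and the source does not describe how SGFormer computes its weights. The sketch blends a satellite feature vector and a ground feature vector for a single scene cell, using a gate in (0, 1) as the satellite weight.

```python
import math


def adaptive_fuse(sat_feat, ground_feat, w_sat, w_ground, bias=0.0):
    """Blend satellite and ground features for one scene cell.

    Hypothetical gate (an assumption, not taken from the paper): a sigmoid
    over a linear score of both feature vectors. Returns (fused, gate),
    where gate is the weight given to the satellite features.
    """
    score = bias
    score += sum(w * f for w, f in zip(w_sat, sat_feat))
    score += sum(w * f for w, f in zip(w_ground, ground_feat))
    gate = 1.0 / (1.0 + math.exp(-score))
    # Convex combination: gate -> 1 trusts the satellite view,
    # gate -> 0 trusts the ground view.
    fused = [gate * s + (1.0 - gate) * g for s, g in zip(sat_feat, ground_feat)]
    return fused, gate


sat = [1.0, 3.0]
ground = [3.0, 1.0]
# With all scoring weights at zero the gate is 0.5, so both views count equally.
fused, gate = adaptive_fuse(sat, ground, w_sat=[0.0, 0.0], w_ground=[0.0, 0.0])
print(fused, gate)
```

In a trained model the scoring weights would be learned, so that occluded ground regions could lean on satellite features while visible regions lean on ground features. The sketch only shows the blending arithmetic.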
Xiyue Guo
State Key Lab of CAD&CG, Zhejiang University
Jiarui Hu
Zhejiang University
Computer Vision, Robotics, Computer Graphics
Junjie Hu
Chinese University of Hong Kong, Shenzhen
Hujun Bao
State Key Lab of CAD&CG, Zhejiang University
Guofeng Zhang
State Key Lab of CAD&CG, Zhejiang University