IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address incomplete 3D scene reconstruction caused by occlusions and limited field-of-view in single-scan acquisition, this paper proposes an interactive multi-scan Gaussian splatting method for high-fidelity scene reconstruction. Our approach features: (1) segmentation-aware Gaussian field modeling, jointly enforcing photometric and semantic consistency; (2) a pseudo-intermediate scene representation enabling robust cross-scan alignment without requiring dense scanning; and (3) a collaborative co-pruning geometric optimization strategy supporting object-level editing. Evaluated on real indoor scenes, our method achieves high-fidelity novel-view synthesis, real-time interactive editing, and cross-configuration generalization. It significantly improves reconstruction completeness and editability compared to prior single-scan methods, establishing a new paradigm for real-to-simulation transfer.

📝 Abstract
Reconstructing complete and interactive 3D scenes remains a fundamental challenge in computer vision and robotics, particularly due to persistent object occlusions and limited sensor coverage. Multiview observations from a single scene scan often fail to capture the full structural details. Existing approaches typically rely on multi-stage pipelines, such as segmentation, background completion, and inpainting, or require per-object dense scanning, both of which are error-prone and not easily scalable. We propose IGFuse, a novel framework that reconstructs an interactive Gaussian scene by fusing observations from multiple scans, where natural object rearrangement between captures reveals previously occluded regions. Our method constructs segmentation-aware Gaussian fields and enforces bi-directional photometric and semantic consistency across scans. To handle spatial misalignments, we introduce a pseudo-intermediate scene state for unified alignment, alongside collaborative co-pruning strategies to refine geometry. IGFuse enables high-fidelity rendering and object-level scene manipulation without dense observations or complex pipelines. Extensive experiments validate the framework's strong generalization to novel scene configurations, demonstrating its effectiveness for real-world 3D reconstruction and real-to-simulation transfer. Our project page is available online.
Problem

Research questions and friction points this paper is trying to address.

Reconstructing complete 3D scenes despite occlusions and limited sensor coverage
Overcoming incomplete structural details from single-scene multiview observations
Avoiding error-prone multi-stage pipelines or per-object dense scanning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses multi-scan observations to recover occluded regions
Uses segmentation-aware Gaussian fields for consistency
Introduces pseudo-intermediate state for alignment
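The co-pruning idea listed above can be illustrated with a minimal toy sketch: each scan's Gaussians are reduced to labeled centers, and a Gaussian survives only if the other scan contains a Gaussian with the same semantic label nearby. The function name, the pure distance criterion, and the threshold `tau` are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import math

def co_prune(pos_a, lab_a, pos_b, lab_b, tau=0.1):
    """Toy symmetric co-pruning between two scans.

    pos_*: lists of (x, y, z) Gaussian centers
    lab_*: parallel lists of integer segment labels
    A center in one scan is kept only if the other scan has a center
    with the same label within distance `tau` (an assumed criterion).
    Returns two boolean keep-masks, one per scan.
    """
    def keep_mask(src_pos, src_lab, ref_pos, ref_lab):
        keep = []
        for p, l in zip(src_pos, src_lab):
            # Candidates in the reference scan sharing the semantic label.
            cands = [q for q, m in zip(ref_pos, ref_lab) if m == l]
            keep.append(any(math.dist(p, q) < tau for q in cands))
        return keep

    return (keep_mask(pos_a, lab_a, pos_b, lab_b),
            keep_mask(pos_b, lab_b, pos_a, lab_a))
```

In this sketch, a floating artifact that appears in only one scan has no cross-scan counterpart and is pruned from both directions, which is the intuition behind refining geometry collaboratively across scans.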
👥 Authors
Wenhao Hu (College of Computer Science and Technology, Zhejiang University)
Zesheng Li (Nanyang Technological University)
Haonan Zhou (HKU Business School)
Liu Liu (Horizon Robotics)
Xuexiang Wen (ZJU-UIUC Institute, Zhejiang University)
Zhizhong Su (Horizon Robotics)
Xi Li (College of Computer Science and Technology, Zhejiang University)
Gaoang Wang (Zhejiang University / University of Illinois Urbana-Champaign Institute)