Segmentation-Driven Initialization for Sparse-view 3D Gaussian Splatting

📅 2025-09-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Sparse-view 3D Gaussian Splatting (3DGS) reconstruction suffers from inaccurate geometry recovery and excessive memory consumption: pipelines that depend on Structure-from-Motion (SfM) struggle under extremely sparse views, while SfM-free pipelines initialized from multi-view stereo (MVS) back-project every pixel into 3D, so the Gaussian count grows with image resolution and number of views. To address this, we propose SDI-GS, a segmentation-guided, region-aware initialization that retains points only in structurally significant regions, reducing Gaussian count by up to 50%. The approach combines region-based segmentation, MVS-derived dense point clouds, and 3DGS rendering in an SfM-free reconstruction pipeline. Across multiple benchmarks, it achieves comparable or superior PSNR and SSIM, with only marginal degradation in LPIPS, while training faster and using less memory.

📝 Abstract

Sparse-view synthesis remains a challenging problem due to the difficulty of recovering accurate geometry and appearance from limited observations. While recent advances in 3D Gaussian Splatting (3DGS) have enabled real-time rendering with competitive quality, existing pipelines often rely on Structure-from-Motion (SfM) for camera pose estimation, an approach that struggles in genuinely sparse-view settings. Moreover, several SfM-free methods replace SfM with multi-view stereo (MVS) models, but generate massive numbers of 3D Gaussians by back-projecting every pixel into 3D space, leading to high memory costs. We propose Segmentation-Driven Initialization for Gaussian Splatting (SDI-GS), a method that mitigates this inefficiency by leveraging region-based segmentation to identify and retain only structurally significant regions. This enables selective downsampling of the dense point cloud, preserving scene fidelity while substantially reducing Gaussian count. Experiments across diverse benchmarks show that SDI-GS reduces Gaussian count by up to 50% and achieves comparable or superior rendering quality in PSNR and SSIM, with only marginal degradation in LPIPS. It further enables faster training and a lower memory footprint, advancing the practicality of 3DGS for constrained-view scenarios.
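To make the memory concern concrete, the sketch below back-projects every pixel of a depth map through a standard pinhole camera model, which is the kind of dense initialization the abstract attributes to MVS-based, SfM-free pipelines. The function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy resolution are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift every pixel (u, v) with depth z to a camera-space 3D point.

    Hypothetical helper: uses the pinhole model
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 4x6 depth map with constant depth (assumed values, for illustration only).
depth = [[2.0] * 6 for _ in range(4)]
points = backproject_depth(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.0)

# One candidate Gaussian per pixel per view: the count scales with H * W * views.
num_views = 3
total_points = num_views * len(points)
print(len(points), total_points)
```

With one point per pixel, three 1920×1080 views would already yield about 6.2 million candidate Gaussians before any optimization, which is why a pruning step at initialization matters.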

Problem

Research questions and friction points this paper is trying to address.

Sparse-view 3D reconstruction from limited observations

Reducing excessive Gaussian count in 3DGS pipelines

Improving efficiency in memory and training time

Innovation

Methods, ideas, or system contributions that make the work stand out.

Segmentation-driven initialization for Gaussian splatting

Selective downsampling of dense point clouds

Reduction of Gaussian count by up to 50% while preserving rendering fidelity
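The idea of selective downsampling can be sketched as follows. The summary does not specify how SDI-GS decides which regions are structurally significant or how aggressively it thins the rest, so this sketch assumes a hypothetical rule: keep every point whose segment label is in a given `salient` set, and keep only every `stride`-th point within each remaining segment. The names `region_aware_downsample`, `salient`, and `stride` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float, float]


def region_aware_downsample(
    points: Sequence[Point],
    labels: Sequence[int],
    salient: Set[int],
    stride: int = 4,
) -> List[Point]:
    """Keep all points in salient regions; keep every `stride`-th point elsewhere.

    Hypothetical rule, not the paper's: `labels[i]` is the segment id of the
    pixel that point i was back-projected from.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    if stride < 1:
        raise ValueError("stride must be >= 1")

    seen: Dict[int, int] = defaultdict(int)  # per-segment running index
    kept: List[Point] = []
    for point, label in zip(points, labels):
        index_in_segment = seen[label]
        seen[label] += 1
        if label in salient or index_in_segment % stride == 0:
            kept.append(point)
    return kept


# Toy data (assumed): segment 0 is a detailed object, segment 1 a flat background.
points = [(float(i), 0.0, 1.0) for i in range(24)]
labels = [0] * 4 + [1] * 20
kept = region_aware_downsample(points, labels, salient={0}, stride=4)
print(len(points), len(kept))
```

The toy ratio here is arbitrary and does not reproduce the reported reduction. The point of the design is that thinning happens per region, so detailed areas keep their full density while large uniform areas contribute far fewer initial Gaussians.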

🔎 Similar Papers

SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition