Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis

📅 2025-04-02
📈 Citations: 0 · Influential citations: 0
🤖 AI Summary
To address quality degradation in 3D reconstruction and novel-view synthesis under large-scale, unconstrained settings, where sparse and non-uniform input views cause geometric and texture distortions (e.g., from transient occlusions, appearance variations, and inconsistent camera calibration), this paper proposes GS-Diff, a diffusion-guided Gaussian Splatting framework. Methodologically, it introduces the first multi-view conditional diffusion model to synthesize high-fidelity pseudo-observations, transforming an ill-posed reconstruction task into a well-constrained one. It further integrates monocular depth priors, dynamic object modeling, appearance embeddings, anisotropic covariance regularization, and adaptive rasterization. Evaluated on four mainstream benchmarks, GS-Diff achieves significant improvements over state-of-the-art methods, delivering more robust and fine-grained geometric and textural reconstruction under challenging conditions, including sparse viewpoints, severe occlusions, and substantial illumination changes.
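The core idea lends itself to a short sketch. Below is a minimal, hypothetical PyTorch training step illustrating how diffusion-generated pseudo-observations could supplement the real photometric loss; `scene`, `diffusion`, their methods, and the weight `lambda_pseudo` are placeholders for illustration, not the paper's actual API.

```python
import torch
import torch.nn.functional as F

def train_step(scene, diffusion, cameras, images, novel_cams, lambda_pseudo=0.5):
    """One optimization step: photometric loss on the real sparse views plus a
    loss against diffusion-synthesized pseudo-observations at novel viewpoints.
    All objects here are hypothetical stand-ins for the paper's components."""
    # Photometric loss on the real (sparse, unevenly distributed) views.
    loss_real = sum(
        F.l1_loss(scene.render(cam), img)
        for cam, img in zip(cameras, images)
    ) / len(cameras)

    # Pseudo-observations: the multi-view conditional diffusion model
    # synthesizes plausible images at novel poses, conditioned on real views.
    with torch.no_grad():
        pseudo_imgs = diffusion.sample(cond_images=images, cond_cams=cameras,
                                       target_cams=novel_cams)

    # Supervising renders at these extra poses turns the under-constrained
    # reconstruction into a better-posed optimization problem.
    loss_pseudo = sum(
        F.l1_loss(scene.render(cam), img)
        for cam, img in zip(novel_cams, pseudo_imgs)
    ) / len(novel_cams)

    return loss_real + lambda_pseudo * loss_pseudo
```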

📝 Abstract
Recent advancements in 3D Gaussian Splatting (3DGS) and Neural Radiance Fields (NeRF) have achieved impressive results in real-time 3D reconstruction and novel view synthesis. However, these methods struggle in large-scale, unconstrained environments where sparse and uneven input coverage, transient occlusions, appearance variability, and inconsistent camera settings lead to degraded quality. We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address these limitations. By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones, enabling robust optimization even with sparse data. GS-Diff further integrates several enhancements, including appearance embedding, monocular depth priors, dynamic object modeling, anisotropy regularization, and advanced rasterization techniques, to tackle geometric and photometric challenges in real-world settings. Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
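Among the listed enhancements, anisotropy regularization is the easiest to illustrate. The paper's exact formulation is not reproduced here; a common variant, shown below as an assumption, penalizes Gaussians whose largest covariance axis exceeds a fixed ratio of the smallest, suppressing needle-like splat artifacts in sparsely observed regions.

```python
import torch

def anisotropy_loss(log_scales, max_ratio=10.0):
    """Penalize Gaussians whose longest axis exceeds max_ratio times the
    shortest. log_scales: (N, 3) per-axis log standard deviations, as in
    typical 3DGS implementations. A sketch, not the paper's exact regularizer."""
    scales = torch.exp(log_scales)                                   # (N, 3)
    ratio = scales.max(dim=1).values / scales.min(dim=1).values.clamp_min(1e-8)
    # Only ratios beyond the threshold contribute to the penalty.
    return torch.relu(ratio - max_ratio).mean()
```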
Problem

Research questions and friction points this paper is trying to address.

Improves 3D reconstruction in large-scale unconstrained environments
Addresses challenges from sparse data and inconsistent camera settings
Enhances geometric and photometric accuracy in real-world settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view diffusion model guides 3DGS framework
Generates pseudo-observations for robust sparse data optimization
Integrates appearance embeddings and monocular depth priors as enhancements (see the depth-prior sketch below)
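As referenced above, a brief sketch of how a monocular depth prior is typically applied: monocular estimates are defined only up to an affine (scale-and-shift) transform, so a standard approach, assumed here rather than taken from the paper, aligns them to the rendered depth with a closed-form least-squares fit before comparing.

```python
import torch
import torch.nn.functional as F

def depth_prior_loss(rendered_depth, mono_depth, valid):
    """Scale-and-shift-invariant depth loss. rendered_depth/mono_depth are
    per-pixel depth maps; valid is a boolean mask of reliable pixels.
    A common formulation, not necessarily the paper's exact loss."""
    d, m = rendered_depth[valid], mono_depth[valid]                  # (K,)
    A = torch.stack([m, torch.ones_like(m)], dim=1)                  # (K, 2)
    # Closed-form scale s and shift t minimizing ||s*m + t - d||^2.
    sol = torch.linalg.lstsq(A, d.unsqueeze(1)).solution.squeeze(1)  # (2,)
    aligned = sol[0] * m + sol[1]
    return F.l1_loss(aligned, d)
```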