Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis

📅 2025-04-02
📈 Citations: 0 · Influential citations: 0
🤖 AI Summary
To address quality degradation in 3D reconstruction and novel-view synthesis under large-scale, unconstrained settings, where sparse and non-uniform input views cause geometric and texture distortions (e.g., from transient occlusions, appearance variations, and inconsistent camera calibration), this paper proposes GS-Diff, a diffusion-guided Gaussian Splatting framework. Methodologically, it introduces the first multi-view conditional diffusion model to synthesize high-fidelity pseudo-observations, transforming an ill-posed reconstruction task into a well-constrained one. It further integrates monocular depth priors, dynamic object modeling, appearance embeddings, anisotropic covariance regularization, and adaptive rasterization. Evaluated on four mainstream benchmarks, GS-Diff achieves significant improvements over state-of-the-art methods, delivering more robust and fine-grained geometric and textural reconstruction under challenging conditions, including sparse viewpoints, severe occlusions, and substantial illumination changes.
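The core idea lends itself to a short sketch. Below is a minimal, hypothetical PyTorch training step illustrating how diffusion-generated pseudo-observations could supplement the real photometric loss; `scene`, `diffusion`, their methods, and the weight `lambda_pseudo` are placeholders for illustration, not the paper's actual API.

```python
import torch
import torch.nn.functional as F

def train_step(scene, diffusion, cameras, images, novel_cams, lambda_pseudo=0.5):
    """One optimization step: photometric loss on the real sparse views plus a
    loss against diffusion-synthesized pseudo-observations at novel viewpoints.
    All objects here are hypothetical stand-ins for the paper's components."""
    # Photometric loss on the real (sparse, unevenly distributed) views.
    loss_real = sum(
        F.l1_loss(scene.render(cam), img)
        for cam, img in zip(cameras, images)
    ) / len(cameras)

    # Pseudo-observations: the multi-view conditional diffusion model
    # synthesizes plausible images at novel poses, conditioned on real views.
    with torch.no_grad():
        pseudo_imgs = diffusion.sample(cond_images=images, cond_cams=cameras,
                                       target_cams=novel_cams)

    # Supervising renders at these extra poses turns the under-constrained
    # reconstruction into a better-posed optimization problem.
    loss_pseudo = sum(
        F.l1_loss(scene.render(cam), img)
        for cam, img in zip(novel_cams, pseudo_imgs)
    ) / len(novel_cams)

    return loss_real + lambda_pseudo * loss_pseudo
```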

📝 Abstract
Recent advancements in 3D Gaussian Splatting (3DGS) and Neural Radiance Fields (NeRF) have achieved impressive results in real-time 3D reconstruction and novel view synthesis. However, these methods struggle in large-scale, unconstrained environments where sparse and uneven input coverage, transient occlusions, appearance variability, and inconsistent camera settings lead to degraded quality. We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address these limitations. By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones, enabling robust optimization even with sparse data. GS-Diff further integrates several enhancements, including appearance embedding, monocular depth priors, dynamic object modeling, anisotropy regularization, and advanced rasterization techniques, to tackle geometric and photometric challenges in real-world settings. Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
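Among the listed enhancements, anisotropy regularization is the easiest to illustrate. The paper's exact formulation is not reproduced here; a common variant, shown below as an assumption, penalizes Gaussians whose largest covariance axis exceeds a fixed ratio of the smallest, suppressing needle-like splat artifacts in sparsely observed regions.

```python
import torch

def anisotropy_loss(log_scales, max_ratio=10.0):
    """Penalize Gaussians whose longest axis exceeds max_ratio times the
    shortest. log_scales: (N, 3) per-axis log standard deviations, as in
    typical 3DGS implementations. A sketch, not the paper's exact regularizer."""
    scales = torch.exp(log_scales)                                   # (N, 3)
    ratio = scales.max(dim=1).values / scales.min(dim=1).values.clamp_min(1e-8)
    # Only ratios beyond the threshold contribute to the penalty.
    return torch.relu(ratio - max_ratio).mean()
```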
Problem

Research questions and friction points this paper is trying to address.

Improves 3D reconstruction in large-scale unconstrained environments
Addresses challenges from sparse data and inconsistent camera settings
Enhances geometric and photometric accuracy in real-world settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view diffusion model guides 3DGS framework
Generates pseudo-observations for robust sparse data optimization
Integrates appearance embeddings and monocular depth priors as enhancements (see the depth-prior sketch below)
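As referenced above, a brief sketch of how a monocular depth prior is typically applied: monocular estimates are defined only up to an affine (scale-and-shift) transform, so a standard approach, assumed here rather than taken from the paper, aligns them to the rendered depth with a closed-form least-squares fit before comparing.

```python
import torch
import torch.nn.functional as F

def depth_prior_loss(rendered_depth, mono_depth, valid):
    """Scale-and-shift-invariant depth loss. rendered_depth/mono_depth are
    per-pixel depth maps; valid is a boolean mask of reliable pixels.
    A common formulation, not necessarily the paper's exact loss."""
    d, m = rendered_depth[valid], mono_depth[valid]                  # (K,)
    A = torch.stack([m, torch.ones_like(m)], dim=1)                  # (K, 2)
    # Closed-form scale s and shift t minimizing ||s*m + t - d||^2.
    sol = torch.linalg.lstsq(A, d.unsqueeze(1)).solution.squeeze(1)  # (2,)
    aligned = sol[0] * m + sol[1]
    return F.l1_loss(aligned, d)
```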