🤖 AI Summary
This work addresses the performance limitations of 3D Gaussian Splatting (3DGS) under sparse-view conditions, where inaccurate camera poses and poor point cloud initialization degrade reconstruction quality. To overcome these challenges, the authors propose a robust dense initialization strategy by integrating the reference-free point cloud estimation network π³ with 3DGS for the first time. Furthermore, they introduce several geometric regularization mechanisms—including uncertainty-guided depth supervision, normal consistency loss, and depth warping—to enhance geometric fidelity. These innovations significantly improve both the geometric accuracy and rendering quality of novel view synthesis. The method achieves state-of-the-art results across multiple benchmarks, including Tanks and Temples, LLFF, DTU, and MipNeRF360.
📝 Abstract
Novel view synthesis has evolved rapidly, advancing from Neural Radiance Fields to 3D Gaussian Splatting (3DGS), which offers real-time rendering and rapid training without compromising visual fidelity. However, 3DGS relies heavily on accurate camera poses and high-quality point cloud initialization, which are difficult to obtain in sparse-view scenarios. While traditional Structure from Motion (SfM) pipelines often fail in these settings, existing learning-based point estimation alternatives typically require reliable reference views and remain sensitive to pose or depth errors. In this work, we propose a robust method utilizing π³, a reference-free point cloud estimation network. We integrate dense initialization from π³ with a regularization scheme designed to mitigate geometric inaccuracies. Specifically, we employ uncertainty-guided depth supervision, normal consistency loss, and depth warping. Experimental results demonstrate that our approach achieves state-of-the-art performance on the Tanks and Temples, LLFF, DTU, and MipNeRF360 datasets.
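The abstract does not spell out the loss formulations, but the two named per-pixel regularizers follow common patterns in the 3DGS literature. The sketch below is a minimal illustration of those patterns, not the paper's actual implementation: it assumes an uncertainty-weighted L1 depth term (prior depth down-weighted where the point-estimation network is uncertain) and a normal consistency term of the form 1 − cosine similarity. All function names and the NumPy formulation are hypothetical.

```python
import numpy as np

def uncertainty_guided_depth_loss(pred_depth, prior_depth, uncertainty, eps=1e-6):
    """Hypothetical uncertainty-guided depth supervision: an L1 term on the
    depth prior, down-weighted where the estimation network is uncertain."""
    weights = 1.0 / (uncertainty + eps)       # high uncertainty -> low weight
    weights = weights / weights.sum()         # normalize over pixels
    return float(np.sum(weights * np.abs(pred_depth - prior_depth)))

def normal_consistency_loss(rendered_normals, depth_normals):
    """Hypothetical normal consistency term: penalize disagreement between
    rendered normals and normals derived from depth (1 - cosine similarity,
    assuming both inputs are unit vectors of shape [N, 3])."""
    dot = np.sum(rendered_normals * depth_normals, axis=-1)
    return float(np.mean(1.0 - dot))
```

For example, with two pixels of equal uncertainty and a 0.5 m error on one of them, the depth term averages to 0.25; identical normal maps yield a consistency loss of zero. The actual paper may use different weighting schemes or robust norms.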