AI Summary
In sparse-view settings, 3D Gaussian Splatting (3DGS) tends to overfit the training views, producing blurry artifacts in novel-view synthesis. The paper's controlled ablations indicate that initialization sets the attainable performance band, while training-time regularization yields only modest gains within that band. The paper therefore focuses on constructing a better initial point cloud and introduces three components: (1) frequency-aware Structure-from-Motion (SfM), which improves coverage of low-texture regions through low-frequency view augmentation and relaxed multi-view correspondences; (2) self-initialization for 3DGS, which uses photometric supervision to add learned Gaussian centers in regions where SfM points are sparse; and (3) point-cloud regularization, which applies simple geometric and visibility priors to enforce multi-view consistency and uniform spatial coverage. Evaluated under sparse-view protocols on the LLFF and Mip-NeRF360 benchmarks, the method consistently outperforms state-of-the-art approaches, with improvements reported across PSNR, SSIM, and LPIPS.
Abstract
Sparse-view 3D Gaussian Splatting (3DGS) often overfits to the training views, leading to artifacts such as blurring in novel-view rendering. Prior work addresses this either by enhancing the initialization (i.e., the point cloud from Structure-from-Motion (SfM)) or by adding training-time constraints (regularization) to the 3DGS optimization. Yet our controlled ablations reveal that initialization is the decisive factor: it determines the attainable performance band in sparse-view 3DGS, while training-time constraints yield only modest within-band improvements at extra cost. Given initialization's primacy, we focus our design there. Although SfM performs poorly under sparse views because of its reliance on feature matching, it still provides reliable seed points. Building on SfM, we therefore aim to supplement the regions it fails to cover as comprehensively as possible. Specifically, we design: (i) frequency-aware SfM that improves low-texture coverage via low-frequency view augmentation and relaxed multi-view correspondences; (ii) 3DGS self-initialization that lifts photometric supervision into additional points, compensating for SfM-sparse regions with learned Gaussian centers; and (iii) point-cloud regularization that enforces multi-view consistency and uniform spatial coverage through simple geometric/visibility priors, yielding a clean and reliable point cloud. Our experiments on LLFF and Mip-NeRF360 demonstrate consistent gains in sparse-view settings, establishing our approach as a stronger initialization strategy. Code is available at https://github.com/zss171999645/ItG-GS.
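To make the idea behind component (iii) concrete, the following is a minimal sketch of one way geometric/visibility priors could regularize a point cloud. It is not taken from the paper or its repository: the function names (`count_visible_views`, `voxel_thin`, `regularize_point_cloud`), the pinhole camera model, the `min_views` threshold, and the voxel-grid thinning step are all assumptions chosen for illustration. The sketch keeps points that project inside at least `min_views` images (a crude multi-view consistency check) and then keeps one point per voxel (a crude uniform-coverage prior).

```python
import numpy as np


def count_visible_views(points, intrinsics, world_to_cams, image_size):
    """Count, per 3D point, the views where it lies in front of the camera
    and projects inside the image bounds (hypothetical visibility prior)."""
    width, height = image_size
    homog = np.hstack([points, np.ones((len(points), 1))])  # (N, 4)
    counts = np.zeros(len(points), dtype=int)
    for K, w2c in zip(intrinsics, world_to_cams):
        cam = (w2c @ homog.T).T[:, :3]            # points in the camera frame
        in_front = cam[:, 2] > 1e-6
        z = np.where(in_front, cam[:, 2], 1.0)    # avoid dividing by z <= 0
        pix = (K @ cam.T).T
        u = pix[:, 0] / z
        v = pix[:, 1] / z
        inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
        counts += in_front & inside
    return counts


def voxel_thin(points, voxel_size):
    """Keep the first point that falls in each voxel (assumed coverage prior)."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]


def regularize_point_cloud(points, intrinsics, world_to_cams, image_size,
                           min_views=2, voxel_size=0.05):
    """Filter by multi-view visibility, then thin to roughly uniform density."""
    counts = count_visible_views(points, intrinsics, world_to_cams, image_size)
    kept = points[counts >= min_views]
    return voxel_thin(kept, voxel_size)
```

Here `intrinsics` is a list of 3x3 matrices and `world_to_cams` a list of 4x4 world-to-camera transforms, one per training view. A real pipeline would likely also test occlusion (for example against rendered depth) and reprojection error, which this sketch omits.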