AI Summary
To address information loss, sensitivity to image quality, and parameter redundancy in sparse-view 3D reconstruction, this paper proposes a two-stage self-enhancing Gaussian Splatting framework. In the first stage, an initial 3D Gaussian field is constructed from sparse inputs and used to render multi-view images. In the second stage, a Stable Diffusion-style 2D diffusion model is fine-tuned to enhance the rendered images, and the enhanced outputs are leveraged to refine the 3D representation via back-projection. A novel structured mask is introduced to jointly regularize geometry and appearance, significantly improving robustness against view occlusion and noise. This self-enhancing closed-loop mechanism is the first of its kind. The method achieves state-of-the-art perceptual quality and multi-view consistency on MipNeRF360, OmniObject3D, and OpenIllumination, while substantially reducing dependency on both the number and quality of input views.
Abstract
Sparse-view 3D reconstruction is a major challenge in computer vision, aiming to create complete three-dimensional models from limited viewing angles. Key obstacles include: 1) a small number of input images with inconsistent information; 2) dependence on input image quality; and 3) large model parameter sizes. To tackle these issues, we propose a self-augmented two-stage Gaussian splatting framework enhanced with structural masks for sparse-view 3D reconstruction. First, our method generates a basic 3D Gaussian representation from sparse inputs and renders multi-view images. We then fine-tune a pre-trained 2D diffusion model to enhance these images, using them as augmented data to further optimize the 3D Gaussians. Additionally, a structural masking strategy during training enhances the model's robustness to sparse inputs and noise. Experiments on benchmarks such as MipNeRF360, OmniObject3D, and OpenIllumination demonstrate that our approach achieves state-of-the-art performance in perceptual quality and multi-view consistency with sparse inputs.
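The abstract does not specify how the structural mask is built, so the following is only an illustrative sketch of one common form of structured masking: dropping whole square patches of an image rather than independent pixels. The function names `block_mask` and `apply_mask`, the patch size, and the drop ratio are all hypothetical choices made for this example and are not taken from the paper.

```python
import random


def block_mask(height, width, block, drop_ratio, seed=0):
    """Return a height x width grid of 0/1 values; 0 marks a masked pixel.

    Assumption for illustration: whole block x block patches are dropped
    together, so the mask has spatial structure instead of per-pixel noise.
    """
    rng = random.Random(seed)
    rows = (height + block - 1) // block  # number of patch rows (ceiling)
    cols = (width + block - 1) // block   # number of patch columns (ceiling)
    patches = [(r, c) for r in range(rows) for c in range(cols)]
    # Pick a fixed fraction of patches to drop, without replacement.
    dropped = set(rng.sample(patches, int(len(patches) * drop_ratio)))
    return [
        [0 if (y // block, x // block) in dropped else 1 for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask):
    """Zero out masked pixels of a grayscale image given as nested lists."""
    return [
        [pixel * keep for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Hypothetical usage: mask half of the 4x4 patches in an 8x8 image.
mask = block_mask(height=8, width=8, block=4, drop_ratio=0.5)
image = [[1.0] * 8 for _ in range(8)]
masked = apply_mask(image, mask)
```

In a training loop, a mask of this kind would typically be resampled per iteration (by varying `seed`) and applied to the rendered or target image before computing the loss; whether the paper does this is not stated in the abstract.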