🤖 AI Summary
Sparse-view novel view synthesis is fundamentally ill-posed due to geometric ambiguity: regression-based methods preserve geometry faithfully but leave scenes incomplete, whereas generative methods can complete scenes yet often introduce structural inconsistencies. To address this, the authors propose OracleGS, a "propose-and-validate" framework that combines a pre-trained 3D-aware diffusion model with multi-view stereo (MVS) attention. The diffusion model supplies a complete scene proposal in the form of synthesized novel views, and the MVS attention maps act as a geometric oracle that quantifies 3D uncertainty and guides Gaussian Splatting optimization. An uncertainty-weighted loss conditions the generative prior on multi-view geometric evidence, suppressing hallucinated artifacts while keeping plausible completions in under-constrained regions. The method is reported to outperform state-of-the-art approaches on datasets including Mip-NeRF 360 and NeRF Synthetic.
📝 Abstract
Sparse-view novel view synthesis is fundamentally ill-posed due to severe geometric ambiguity. Current methods are caught in a trade-off: regressive models are geometrically faithful but incomplete, whereas generative models can complete scenes but often introduce structural inconsistencies. We propose OracleGS, a novel framework that reconciles generative completeness with regressive fidelity for sparse-view Gaussian Splatting. Instead of using generative models to patch incomplete reconstructions, our "propose-and-validate" framework first leverages a pre-trained 3D-aware diffusion model to synthesize novel views that together propose a complete scene. We then repurpose a multi-view stereo (MVS) model as a 3D-aware oracle to estimate the 3D uncertainty of the generated views, using its attention maps to distinguish regions that are well supported by multi-view evidence from regions of high uncertainty caused by occlusion, lack of texture, or direct inconsistency. This uncertainty signal directly guides the optimization of a 3D Gaussian Splatting model via an uncertainty-weighted loss. Our approach conditions the powerful generative prior on multi-view geometric evidence, filtering hallucinatory artifacts while preserving plausible completions in under-constrained regions, and it outperforms state-of-the-art methods on datasets including Mip-NeRF 360 and NeRF Synthetic.
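The abstract does not spell out the form of the uncertainty-weighted loss, so the following is only a minimal sketch of one plausible reading: a per-pixel L1 photometric error in which each pixel of a generated view is down-weighted by its uncertainty. The function name `uncertainty_weighted_l1`, the `1 - uncertainty` weighting, and the assumption that uncertainty is a per-pixel map in [0, 1] are illustrative choices, not details taken from the paper.

```python
import numpy as np


def uncertainty_weighted_l1(rendered, target, uncertainty, eps=1e-8):
    """Hypothetical uncertainty-weighted photometric loss (illustrative only).

    rendered, target: float arrays of shape (H, W, 3), the splatting render
        and the generated (proposed) view.
    uncertainty: float array of shape (H, W), assumed to lie in [0, 1],
        where 1 means the pixel has no multi-view support.
    """
    # Assumption: confidence is simply the complement of the uncertainty.
    confidence = 1.0 - np.clip(uncertainty, 0.0, 1.0)
    # Per-pixel L1 error, averaged over the colour channels.
    per_pixel = np.abs(rendered - target).mean(axis=-1)
    # Confidence-weighted mean; eps avoids division by zero when every
    # pixel is fully uncertain.
    return float((confidence * per_pixel).sum() / (confidence.sum() + eps))


if __name__ == "__main__":
    rendered = np.zeros((2, 2, 3))
    target = np.zeros((2, 2, 3))
    target[0, 0] = 1.0  # one pixel disagrees with the render

    trusted = np.zeros((2, 2))  # every pixel fully supported
    masked = np.zeros((2, 2))
    masked[0, 0] = 1.0          # the disagreeing pixel is fully uncertain

    print(uncertainty_weighted_l1(rendered, target, trusted))
    print(uncertainty_weighted_l1(rendered, target, masked))
```

Under this weighting, a pixel flagged as fully uncertain contributes nothing to the gradient, so a hallucinated detail there cannot pull the Gaussians toward it, while well-supported pixels are fitted as usual. A softer scheme (for example, a floor on the confidence) would retain some generative guidance in under-constrained regions; which variant the authors use is not stated in the abstract.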