🤖 AI Summary
To address geometric distortions and artifacts in 3D reconstruction and novel-view synthesis under extreme viewpoint extrapolation—caused by representational limitations of current 3D models—this paper proposes Difix, a universal neural enhancement framework built upon a single-step diffusion model. Difix innovatively employs the diffusion model as a plug-and-play enhancer, applied both during training (for pseudo-view purification) and at inference (for residual artifact suppression), and is compatible with mainstream 3D representations including NeRF and 3D Gaussian Splatting. Measured by Fréchet Inception Distance (FID), Difix achieves an average reduction of approximately 2× across multiple benchmarks, effectively suppressing artifacts in under-constrained regions while preserving 3D geometric consistency. Its core contribution lies in the first end-to-end integration of a single-step diffusion model across the full 3D reconstruction workflow—establishing an efficient, general-purpose, and geometry-aware enhancement paradigm for novel-view synthesis.
📝 Abstract
Neural Radiance Fields and 3D Gaussian Splatting have revolutionized 3D reconstruction and novel-view synthesis tasks. However, achieving photorealistic rendering from extreme novel viewpoints remains challenging, as artifacts persist across representations. In this work, we introduce Difix3D+, a novel pipeline designed to enhance 3D reconstruction and novel-view synthesis through single-step diffusion models. At the core of our approach is Difix, a single-step image diffusion model trained to enhance rendered novel views and remove the artifacts caused by underconstrained regions of the 3D representation. Difix serves two critical roles in our pipeline. First, it is used during the reconstruction phase to clean up pseudo-training views that are rendered from the reconstruction and then distilled back into 3D. This greatly enhances underconstrained regions and improves the overall 3D representation quality. More importantly, Difix also acts as a neural enhancer during inference, effectively removing residual artifacts arising from imperfect 3D supervision and the limited capacity of current reconstruction models. Difix3D+ is a general solution—a single model compatible with both NeRF and 3DGS representations—and it achieves an average 2× improvement in FID score over baselines while maintaining 3D consistency.
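The two roles of Difix described above—cleaning pseudo-training views that are distilled back into the 3D representation, then suppressing residual artifacts at render time—can be sketched as a minimal toy loop. All names and numerics below are hypothetical stand-ins: `fit` averages pixel values in place of an actual NeRF/3DGS optimizer, `render` injects an additive offset to mimic artifacts in underconstrained views, and `difix` is a one-call "single-step" corrector standing in for the trained diffusion model.

```python
def fit(views):
    """Toy 'reconstruction': the model is just the mean supervising pixel value."""
    pixels = [p for _pose, img in views for p in img]
    return sum(pixels) / len(pixels)

def render(model, pose, artifact=0.0):
    """Toy renderer: extrapolated poses pick up an additive artifact offset."""
    return [model + artifact for _ in range(4)]  # a 4-pixel 'image'

def difix(image, clean=1.0):
    """Toy single-step enhancer: one feed-forward pass pulling pixels toward
    the clean value (the real Difix is a learned one-step diffusion model)."""
    return [0.5 * (p + clean) for p in image]

def difix3d_plus(train_views, novel_poses, rounds=2):
    model = fit(train_views)  # initial reconstruction from real views
    for _ in range(rounds):
        # Render pseudo-views at novel poses, clean them with Difix,
        # and distill them back into the 3D representation.
        pseudo = [(pose, difix(render(model, pose, artifact=0.4)))
                  for pose in novel_poses]
        model = fit(train_views + pseudo)
    # Inference: apply Difix once more as a neural enhancer on final renders.
    return {pose: difix(render(model, pose, artifact=0.1))
            for pose in novel_poses}

train = [("a", [1.0] * 4), ("b", [1.0] * 4)]  # clean ground-truth value is 1.0
outputs = difix3d_plus(train, ["c", "d"])
```

The sketch only mirrors the pipeline's control flow (progressive distillation of cleaned pseudo-views, then a final enhancement pass); the actual system operates on images and trained networks, not scalars.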