GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views

📅 2026-02-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing feed-forward 3D reconstruction methods struggle to balance efficiency and quality under sparse-view conditions, and incorporating generative priors often compromises inference speed. This work proposes a purely feed-forward iterative refinement framework that progressively improves a 3D Gaussian Splatting representation through a small number of forward-only residual updates. Generative priors distilled from a frozen diffusion model are injected as per-Gaussian cues, with no test-time gradient optimization and no input camera poses. By moving beyond single-pass prediction, the method improves PSNR by up to 2.1 dB over feed-forward baselines on DL3DV, RealEstate10K, and DTU, while keeping inference time on the scale of seconds.
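To give a sense of scale for the reported gain, the short sketch below converts a PSNR difference into the equivalent change in mean squared error, using the standard definition PSNR = 10·log10(peak² / MSE). The baseline MSE is an illustrative value and is not taken from the paper, and a benchmark PSNR averaged over many images does not map exactly onto a single MSE ratio, so this is a rough reading aid only.

```python
import math


def psnr(mse: float, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_psnr_gain(gain_db: float) -> float:
    """Factor by which MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


# Illustrative baseline error; NOT a number reported by the paper.
baseline_mse = 0.004
improved_mse = baseline_mse * mse_ratio_for_psnr_gain(2.1)

print(f"MSE ratio for +2.1 dB: {mse_ratio_for_psnr_gain(2.1):.3f}")
print(f"PSNR gain check: {psnr(improved_mse) - psnr(baseline_mse):.2f} dB")
```

For a single image, a +2.1 dB gain corresponds to multiplying the MSE by roughly 0.62, that is, a reduction of a bit under 40%.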

📝 Abstract

Feed-forward 3D reconstruction offers substantial runtime advantages over per-scene optimization, which remains slow at inference and is often fragile under sparse views. However, existing feed-forward methods still leave room for performance gains, especially on out-of-domain data, and struggle to retain second-scale inference time once a generative prior is introduced. These limitations stem from the one-shot prediction paradigm in existing feed-forward pipelines: models are strictly bounded by capacity, lack inference-time refinement, and are ill-suited to continuously injecting generative priors. We introduce GIFSplat, a purely feed-forward iterative refinement framework for 3D Gaussian Splatting from sparse unposed views. A small number of forward-only residual updates progressively refine the current 3D scene using rendering evidence, achieving a favorable balance between efficiency and quality. Furthermore, we distill a frozen diffusion prior into Gaussian-level cues from enhanced novel renderings, without gradient backpropagation or ever-increasing view-set expansion, thereby enabling per-scene adaptation with a generative prior while preserving feed-forward efficiency. Across DL3DV, RealEstate10K, and DTU, GIFSplat consistently outperforms state-of-the-art feed-forward baselines, improving PSNR by up to +2.1 dB, and it maintains second-scale inference time without requiring camera poses or any test-time gradient optimization.
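The control flow of "a small number of forward-only residual updates" can be pictured with a toy loop: render the current scene, compare against the observations, predict a correction, and add it. The sketch below is not GIFSplat's architecture. `render`, `predict_residual`, and `refine` are hypothetical names; the scene is reduced to one scalar per Gaussian, the renderer is a simple per-Gaussian weighting, and a hand-written rule stands in for the learned update network. The diffusion-prior cues and the unposed-view setting are not modeled at all.

```python
from typing import List

Scene = List[float]  # one scalar parameter per Gaussian (toy simplification)
View = List[float]   # per-Gaussian visibility weight for one camera (toy, nonzero)


def render(scene: Scene, view: View) -> List[float]:
    """Toy renderer: each Gaussian contributes its parameter times its weight."""
    return [w * g for w, g in zip(view, scene)]


def predict_residual(scene: Scene, views: List[View],
                     observations: List[List[float]],
                     step: float = 0.5) -> List[float]:
    """Hypothetical stand-in for a learned update network.

    Maps rendering error to a per-Gaussian correction with forward
    arithmetic only; no gradients are computed or backpropagated.
    """
    residual = [0.0] * len(scene)
    for view, observed in zip(views, observations):
        rendered = render(scene, view)
        for i, w in enumerate(view):
            residual[i] += (observed[i] - rendered[i]) / w
    return [step * r / len(views) for r in residual]


def rendering_error(scene: Scene, views: List[View],
                    observations: List[List[float]]) -> float:
    """Mean squared difference between renderings and observations."""
    total, count = 0.0, 0
    for view, observed in zip(views, observations):
        for r, o in zip(render(scene, view), observed):
            total += (r - o) ** 2
            count += 1
    return total / count


def refine(scene: Scene, views: List[View],
           observations: List[List[float]],
           num_steps: int = 3) -> List[Scene]:
    """Apply a fixed, small number of residual updates; return every iterate."""
    history = [list(scene)]
    for _ in range(num_steps):
        delta = predict_residual(history[-1], views, observations)
        history.append([g + d for g, d in zip(history[-1], delta)])
    return history


# Usage: synthetic observations generated from a known toy scene.
true_scene = [0.9, 0.2, 0.6]
views = [[1.0, 0.5, 0.8], [0.7, 1.0, 0.4]]
observations = [render(true_scene, v) for v in views]
history = refine([0.5, 0.5, 0.5], views, observations)
errors = [rendering_error(s, views, observations) for s in history]
print(errors)
```

In this toy setting each update moves every parameter halfway toward the value implied by the observations, so the rendering error shrinks at every step. The point is the fixed iteration budget and the additive update, which keep the cost predictable, in contrast to running an optimizer until convergence.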

Problem

Research questions and friction points this paper is trying to address.

feed-forward 3D reconstruction

sparse views

generative prior

inference-time refinement

3D Gaussian Splatting

Innovation

Methods, ideas, or system contributions that make the work stand out.

feed-forward 3D reconstruction

iterative refinement

generative prior