GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing feed-forward 3D reconstruction methods struggle to balance efficiency and quality under sparse-view conditions, and incorporating generative priors often compromises inference speed. This work proposes a purely feed-forward iterative refinement framework that progressively improves a 3D Gaussian Splatting representation through a small number of forward-only residual updates. Generative priors distilled from a frozen diffusion model are injected as per-Gaussian cues, and the pipeline requires neither camera poses nor test-time gradient optimization. By moving beyond single-pass prediction, the method improves PSNR by up to 2.1 dB over feed-forward baselines on DL3DV, RealEstate10K, and DTU while maintaining second-scale inference time.

📝 Abstract
Feed-forward 3D reconstruction offers substantial runtime advantages over per-scene optimization, which remains slow at inference and is often fragile under sparse views. However, existing feed-forward methods still leave room for further performance gains, especially on out-of-domain data, and struggle to retain second-scale inference time once a generative prior is introduced. These limitations stem from the one-shot prediction paradigm of existing feed-forward pipelines: models are strictly bounded by capacity, lack inference-time refinement, and are ill-suited to continuously injecting generative priors. We introduce GIFSplat, a purely feed-forward iterative refinement framework for 3D Gaussian Splatting from sparse unposed views. A small number of forward-only residual updates progressively refine the current 3D scene using rendering evidence, achieving a favorable balance between efficiency and quality. Furthermore, we distill a frozen diffusion prior into Gaussian-level cues from enhanced novel renderings without gradient backpropagation or ever-increasing view-set expansion, thereby enabling per-scene adaptation with a generative prior while preserving feed-forward efficiency. Across DL3DV, RealEstate10K, and DTU, GIFSplat consistently outperforms state-of-the-art feed-forward baselines, improving PSNR by up to +2.1 dB, and it maintains second-scale inference time without requiring camera poses or any test-time gradient optimization.
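The forward-only residual update loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the linear `render` function, the pseudo-inverse-based `predict_residual` head, the `step` size, and all array shapes are assumptions chosen so the example runs on its own. In the actual method the update would come from a learned network and the rendering from a Gaussian splatting rasterizer.

```python
import numpy as np

def render(gaussians, proj):
    """Toy stand-in for a splatting renderer (assumption): a fixed linear map."""
    return proj @ gaussians

def predict_residual(rendered, observed, proj, step=0.5):
    """Hypothetical update head: turns rendering error into a parameter residual.

    A real model would use a learned network here; this sketch uses the
    pseudo-inverse of the toy renderer so that no gradients are needed.
    """
    error = observed - rendered
    return step * (np.linalg.pinv(proj) @ error)

def refine(gaussians, observed, proj, num_steps=3):
    """Apply a small number of forward-only residual updates."""
    mse_history = []
    for _ in range(num_steps):
        rendered = render(gaussians, proj)
        mse_history.append(float(np.mean((observed - rendered) ** 2)))
        # Residual update: add a correction to the current scene parameters.
        gaussians = gaussians + predict_residual(rendered, observed, proj)
    final = render(gaussians, proj)
    mse_history.append(float(np.mean((observed - final) ** 2)))
    return gaussians, mse_history

# Toy data (all hypothetical): 8 scene parameters observed through 12 "pixels".
rng = np.random.default_rng(0)
proj = rng.normal(size=(12, 8))
true_params = rng.normal(size=8)
observed = render(true_params, proj)
initial = true_params + rng.normal(scale=0.5, size=8)  # imperfect one-shot guess

refined, mse_history = refine(initial, observed, proj)
```

In this toy setting each update halves the rendering error, so the mean squared error in `mse_history` shrinks at every step. The point is only the control flow: render, compare against the input evidence, and add a predicted residual, with no backpropagation at inference time.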
Problem

Research questions and friction points this paper is trying to address.

feed-forward 3D reconstruction
sparse views
generative prior
inference-time refinement
3D Gaussian Splatting
Innovation

Methods, ideas, or system contributions that make the work stand out.

feed-forward 3D reconstruction
iterative refinement
generative prior
3D Gaussian Splatting
sparse views
Tianyu Chen
La Trobe University, Melbourne, VIC 3086, Australia
Wei Xiang
Distinguished Professor, Cisco Research Chair of AI and IoT, La Trobe University
Internet of Things, Machine Learning, Wireless Sensor Networks, Wireless Communications, Computer
Kang Han
La Trobe University, Melbourne, VIC 3086, Australia
Yu Lu
La Trobe University, Melbourne, VIC 3086, Australia
Di Wu
La Trobe University, Melbourne, VIC 3086, Australia
Gaowen Liu
Cisco Research
machine learning, computer vision, multimedia
Ramana Rao Kompella
Cisco Research, San Jose, CA, USA