FaithFusion: Harmonizing Reconstruction and Generation via Pixel-wise Information Gain

📅 2025-11-26
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address concurrent geometric distortion and photorealistic inconsistency in driving-scene reconstruction under large viewpoint variations, this paper proposes a plug-and-play fusion framework integrating 3D Gaussian Splatting (3DGS) with diffusion models. The core innovation is the first introduction of pixel-wise Expected Information Gain (EIG) as a spatial uncertainty prior, dynamically guiding the diffusion model to prioritize challenging regions for refinement. A pixel-wise weighted bidirectional feedback mechanism further refines the 3DGS geometry using the generated outputs, without architectural modifications or external priors. Evaluated on the Waymo dataset, the method achieves state-of-the-art performance: it outperforms prior works across the NTA-IoU, NTL-IoU, and FID metrics; notably, under an extreme 6-meter lane offset, it attains an FID of 107.47, significantly mitigating geometric drift and over-correction artifacts.

๐Ÿ“ Abstract
In controllable driving-scene reconstruction and 3D scene generation, maintaining geometric fidelity while synthesizing visually plausible appearance under large viewpoint shifts is crucial. However, effective fusion of geometry-based 3DGS and appearance-driven diffusion models faces inherent challenges, as the absence of pixel-wise, 3D-consistent editing criteria often leads to over-restoration and geometric drift. To address these issues, we introduce extbf{FaithFusion}, a 3DGS-diffusion fusion framework driven by pixel-wise Expected Information Gain (EIG). EIG acts as a unified policy for coherent spatio-temporal synthesis: it guides diffusion as a spatial prior to refine high-uncertainty regions, while its pixel-level weighting distills the edits back into 3DGS. The resulting plug-and-play system is free from extra prior conditions and structural modifications.Extensive experiments on the Waymo dataset demonstrate that our approach attains SOTA performance across NTA-IoU, NTL-IoU, and FID, maintaining an FID of 107.47 even at 6 meters lane shift. Our code is available at https://github.com/wangyuanbiubiubiu/FaithFusion.
Problem

Research questions and friction points this paper is trying to address.

Maintaining geometric fidelity while synthesizing plausible appearance under viewpoint shifts
Fusing geometry-based 3DGS with appearance-driven diffusion models effectively
Addressing over-restoration and geometric drift in 3D scene reconstruction and generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pixel-wise Expected Information Gain fusion framework
Guides diffusion as spatial prior for refinement
Distills edits into 3DGS via pixel weighting
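The fusion idea above can be sketched as a per-pixel blend: regions where the 3DGS render is uncertain (high EIG) lean on the diffusion output, while confident regions keep the render. The sketch below is a minimal illustration under assumptions, not the authors' implementation; the function name `eig_weighted_fusion` and the min-max normalization of the EIG map are hypothetical choices for clarity.

```python
import numpy as np

def eig_weighted_fusion(render, diffused, eig, eps=1e-8):
    """Blend a diffusion refinement into a 3DGS render per pixel.

    render:   (H, W, 3) image rendered from 3DGS
    diffused: (H, W, 3) diffusion-refined image
    eig:      (H, W) per-pixel Expected Information Gain (higher = more uncertain)
    """
    # Normalize EIG to [0, 1] so it can serve as a blend weight
    # (a hypothetical choice; the paper's exact weighting may differ).
    w = (eig - eig.min()) / (eig.max() - eig.min() + eps)
    w = w[..., None]  # broadcast the weight over the RGB channels
    # High-EIG pixels follow the diffusion output; low-EIG pixels keep the render.
    return (1.0 - w) * render + w * diffused

# Toy example: 2x2 image with uncertainty concentrated in one pixel.
render = np.zeros((2, 2, 3))    # black render
diffused = np.ones((2, 2, 3))   # white diffusion output
eig = np.array([[0.0, 0.0],
                [0.0, 1.0]])    # only the bottom-right pixel is uncertain
fused = eig_weighted_fusion(render, diffused, eig)
```

In this toy case only the uncertain bottom-right pixel takes the diffusion value; the same weight map could, in principle, also serve as the per-pixel loss weight when distilling the edited image back into the 3DGS parameters.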
Authors

YuAn Wang (Baidu Inc.)
Xiaofan Li (East China Normal University)
Chi Huang (Baidu Inc.)
Wenhao Zhang (Baidu Inc., Nanjing University)
Hao Li (Baidu Inc.)
Bosheng Wang (Baidu Inc.)
Xun Sun (Baidu Inc.)
Jun Wang (Baidu Inc.)