🤖 AI Summary
Existing methods for urban scene reconstruction render poorly under large viewpoint shifts and generalize weakly, which limits their use in closed-loop simulation. This work proposes GenRe, a diffusion-guided generalizable enhancer that, for the first time, learns diffusion priors across scenes and efficiently distills them into any pretrained 3D Gaussian representation without costly per-scene optimization. Within minutes, GenRe improves rendering fidelity and viewpoint generalization under challenging conditions such as lane changes, outperforming existing methods in both reconstruction quality and computational efficiency. The method supports downstream applications, including sensor simulation for autonomous driving.
📝 Abstract
Urban scene reconstruction from real-world observations has emerged as a powerful tool for self-driving development and testing. While current neural rendering approaches achieve high-fidelity rendering along the recorded trajectories, their quality degrades significantly under large viewpoint shifts, limiting their applicability to closed-loop simulation. Recent works have shown promising results in using diffusion models to enhance quality at these challenging viewpoints and distill the improvements back into 3D representations. However, they often require costly per-scene optimization, and the distilled representations remain fragile and fail to generalize beyond the limited set of synthesized views. To address these limitations, we propose GenRe, a novel diffusion-guided generalizable enhancer for urban scene reconstruction. GenRe takes as input any pretrained 3D Gaussian representation and fixes its deficiencies within a few minutes. By learning to distill generative priors across diverse scenes, GenRe efficiently produces a robust, high-fidelity representation that generalizes reliably to challenging unseen viewpoints (e.g., lane changes). Experiments show that GenRe outperforms existing methods in both quality and efficiency and benefits various downstream tasks, enabling robust and scalable sensor simulation for autonomous driving.