🤖 AI Summary
Existing implicit scene representation methods for novel view synthesis in sparse urban street scenes suffer from high computational cost and poor generalization, as they rely on dense viewpoint sampling and per-scene optimization.
Method: We propose a depth-guided generalizable neural rendering framework, Efficient Depth-Guided Urban View Synthesis (EDUS). It uses noisy predicted depth priors as 3D geometric guidance, which makes the model robust to input sparsity, and combines these priors with feature-space alignment and lightweight test-time optimization to support both feed-forward inference and minute-scale per-scene fine-tuning.
Results: Evaluated on KITTI-360 and Waymo, our method achieves state-of-the-art novel view synthesis quality under sparse-view settings with significantly lower computational overhead. It balances high fidelity, efficiency, and practical deployability, enabling scalable real-world application in autonomous driving and urban reconstruction scenarios.
📝 Abstract
Recent advances in implicit scene representation enable high-fidelity street view novel view synthesis. However, existing methods optimize a neural radiance field for each scene, relying heavily on dense training images and extensive computational resources. To mitigate these shortcomings, we introduce a new method called Efficient Depth-Guided Urban View Synthesis (EDUS) for fast feed-forward inference and efficient per-scene fine-tuning. Different from prior generalizable methods that infer geometry based on feature matching, EDUS leverages noisy predicted geometric priors as guidance to enable generalizable urban view synthesis from sparse input images. The geometric priors allow us to apply our generalizable model directly in 3D space, gaining robustness across various sparsity levels. Through comprehensive experiments on the KITTI-360 and Waymo datasets, we demonstrate promising generalization abilities on novel street scenes. Moreover, our results indicate that EDUS achieves state-of-the-art performance in sparse view settings when combined with fast test-time optimization.
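To make the idea of "applying the model directly in 3D space" concrete: the core geometric step behind depth-guided approaches is lifting per-pixel image features into world coordinates using a (possibly noisy) predicted depth map. The sketch below is an illustrative minimal example of that unprojection under a standard pinhole camera model, not the paper's actual implementation; the function name and shapes are assumptions.

```python
import numpy as np

def unproject_to_points(depth, features, K, c2w):
    """Lift per-pixel features into world space using a (noisy) depth map.

    depth:    (H, W) predicted depth per pixel (may be noisy)
    features: (H, W, C) per-pixel image features
    K:        (3, 3) pinhole camera intrinsics
    c2w:      (4, 4) camera-to-world pose
    Returns:  (H*W, 3) world-space points and (H*W, C) features.
    """
    H, W = depth.shape
    # Build homogeneous pixel coordinates [u, v, 1] for every pixel.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    # Back-project to camera-space ray directions (z = 1 plane).
    rays = pix @ np.linalg.inv(K).T
    # Scale rays by predicted depth to get camera-space 3D points.
    cam_pts = rays * depth.reshape(-1, 1)
    # Transform to world coordinates with the camera-to-world pose.
    cam_h = np.concatenate([cam_pts, np.ones((H * W, 1))], axis=1)
    world = (cam_h @ c2w.T)[:, :3]
    return world, features.reshape(-1, features.shape[-1])
```

The resulting point cloud (with attached features) can then be aggregated into a 3D feature volume, which is why such methods stay robust when input views are sparse: geometry comes from the depth prior rather than from cross-view feature matching.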