🤖 AI Summary
To address the high cost and low efficiency of onboard data collection, this paper proposes a novel high-fidelity view synthesis paradigm bridging the infrastructure view to the vehicle view. We introduce the first Gaussian splatting–based cross-view generation framework, integrating adaptive depth warping, cascaded image inpainting, and diffusion modeling, augmented by a cross-view confidence-guided optimization mechanism to ensure multi-view consistency and joint geometric-appearance fidelity. To support training and evaluation, we construct RoadSight—the first real-world, multimodal, multi-view dataset for road scene understanding from infrastructure viewpoints. Experiments demonstrate that our method outperforms StreetGaussian by 45.7%, 34.2%, and 14.9% on NTA-IoU, NTL-IoU, and FID, respectively, significantly enhancing synthetic data quality and downstream task utility.
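The confidence-guided optimization mentioned above can be illustrated with a toy confidence-weighted reconstruction loss. This is an assumed scheme for illustration only; the function name and weighting form are hypothetical and not the paper's exact formulation:

```python
import numpy as np

def confidence_weighted_l1(rendered, target, confidence):
    """Toy confidence-weighted L1 loss (illustrative, not the paper's method).

    Pixels where cross-view confidence is low contribute less to the
    optimization objective; the result is normalized by total confidence.
    """
    confidence = np.asarray(confidence, dtype=float)
    residual = np.abs(np.asarray(rendered, dtype=float) - np.asarray(target, dtype=float))
    return np.sum(confidence * residual) / np.sum(confidence)
```

For example, a pixel with confidence 0 is ignored entirely, so unreliable diffusion-inpainted regions would not pull the reconstruction toward inconsistent content.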
📝 Abstract
Vast amounts of high-quality data are essential for end-to-end autonomous driving systems. However, current driving data is mainly collected by vehicles, which is expensive and inefficient. A potential solution lies in synthesizing data from real-world images. Recent advancements in 3D reconstruction demonstrate photorealistic novel view synthesis, highlighting the potential of generating driving data from images captured on the road. This paper introduces a novel method, I2V-GS, to transfer the Infrastructure view To the Vehicle view with Gaussian Splatting. Reconstruction from sparse infrastructure viewpoints and rendering under large view transformations is a challenging problem. We adopt an adaptive depth warp to generate dense training views. To further expand the range of views, we employ a cascade strategy to inpaint warped images, which also ensures that inpainted content is consistent across views. To ensure the reliability of the diffusion model, we utilize cross-view information to perform a confidence-guided optimization. Moreover, we introduce RoadSight, a multi-modality, multi-view dataset captured from real scenarios in infrastructure views. To our knowledge, I2V-GS is the first framework to generate autonomous driving datasets via infrastructure-to-vehicle view transformation. Experimental results demonstrate that I2V-GS significantly improves synthesis quality under the vehicle view, outperforming StreetGaussian in NTA-IoU, NTL-IoU, and FID by 45.7%, 34.2%, and 14.9%, respectively.
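The depth warp used to generate dense training views can be sketched with the classic pinhole forward-warping formulation: back-project each source pixel using its depth, transform into the target camera frame, and re-project. This is a minimal sketch of standard depth warping under assumed shared intrinsics, not the paper's adaptive variant; `depth_warp` is a hypothetical helper name:

```python
import numpy as np

def depth_warp(depth, K, R, t):
    """Forward-warp source pixel coordinates into a target view.

    depth : (H, W) per-pixel depth in the source camera frame
    K     : (3, 3) camera intrinsics (assumed shared by both views)
    R, t  : rotation (3, 3) and translation (3,) from source to target frame

    Returns an (H, W, 2) array giving, for each source pixel, its projected
    (u, v) location in the target image.
    """
    h, w = depth.shape
    # Homogeneous pixel grid, shape 3 x N
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T.astype(float)
    # Back-project to 3D points in the source camera frame
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    # Rigid transform into the target frame, then perspective re-projection
    pts_t = R @ pts + t.reshape(3, 1)
    proj = K @ pts_t
    uv = proj[:2] / proj[2:3]
    return uv.T.reshape(h, w, 2)
```

With an identity pose the warp maps every pixel to itself, which is a useful sanity check before applying large infrastructure-to-vehicle view changes, where occlusions and holes in the warped image then motivate the cascaded inpainting step.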