ReconDrive: Fast Feed-Forward 4D Gaussian Splatting for Autonomous Driving Scene Reconstruction

📅 2026-03-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the trade-off between efficiency and fidelity in existing 4D Gaussian splatting methods for autonomous driving scene reconstruction: per-scene optimization lacks scalability, while feed-forward approaches suffer from insufficient photometric accuracy. To overcome these limitations, the authors propose ReconDrive, a feed-forward framework built upon the 3D foundation model VGGT. It achieves efficient, high-fidelity reconstruction by decoupling spatial-coordinate and appearance-attribute prediction, introducing a mixture-of-Gaussians prediction head, and designing an explicit motion-aware 4D synthesis mechanism that jointly models static and dynamic elements. Experiments on nuScenes demonstrate that ReconDrive significantly outperforms current feed-forward methods, achieving reconstruction quality, novel-view synthesis, and 3D perception performance comparable to per-scene optimization while offering orders-of-magnitude faster inference.

📝 Abstract
High-fidelity visual reconstruction and novel-view synthesis are essential for realistic closed-loop evaluation in autonomous driving. While 4D Gaussian Splatting (4DGS) offers a promising balance of accuracy and efficiency, existing per-scene optimization methods require costly iterative refinement, rendering them unscalable for extensive urban environments. Conversely, current feed-forward approaches often suffer from degraded photometric quality. To address these limitations, we propose ReconDrive, a feed-forward framework that leverages and extends the 3D foundation model VGGT for rapid, high-fidelity 4DGS generation. Our architecture introduces two core adaptations to tailor the foundation model to dynamic driving scenes: (1) Hybrid Gaussian Prediction Heads, which decouple the regression of spatial coordinates and appearance attributes to overcome the photometric deficiencies inherent in generalized foundation features; and (2) a Static-Dynamic 4D Composition strategy that explicitly captures temporal motion via velocity modeling to represent complex dynamic environments. Benchmarked on nuScenes, ReconDrive significantly outperforms existing feed-forward baselines in reconstruction, novel-view synthesis, and 3D perception. It achieves performance competitive with per-scene optimization while being orders of magnitude faster, providing a scalable and practical solution for realistic driving simulation.
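The abstract's Static-Dynamic 4D Composition strategy models temporal motion with per-Gaussian velocities. A minimal sketch of that idea, assuming a constant-velocity motion model and illustrative array shapes (none of these names come from the ReconDrive code, which is not public here):

```python
import numpy as np

# Hypothetical sketch of static-dynamic 4D composition: the scene at time t
# is the union of static Gaussian centers and dynamic centers propagated
# by their per-Gaussian velocities. Shapes and names are assumptions.
def compose_4d(static_means, dyn_means, dyn_velocities, t, t_ref=0.0):
    """Propagate dynamic Gaussian centers linearly and merge with statics.

    static_means:   (Ns, 3) static Gaussian centers
    dyn_means:      (Nd, 3) dynamic centers at reference time t_ref
    dyn_velocities: (Nd, 3) per-Gaussian velocities (m/s)
    """
    moved = dyn_means + dyn_velocities * (t - t_ref)  # constant-velocity model
    return np.concatenate([static_means, moved], axis=0)

# Toy usage: one static Gaussian and one dynamic Gaussian moving at 2 m/s in +x.
static = np.array([[0.0, 0.0, 0.0]])
dyn = np.array([[1.0, 0.0, 0.0]])
vel = np.array([[2.0, 0.0, 0.0]])
means_at_t = compose_4d(static, dyn, vel, t=0.5)
```

In a real renderer the merged set would then be rasterized per frame; the paper's point is that this composition is predicted feed-forward rather than optimized per scene.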
Problem

Research questions and friction points this paper is trying to address.

4D Gaussian Splatting
autonomous driving
scene reconstruction
feed-forward
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D Gaussian Splatting
feed-forward reconstruction
Hybrid Gaussian Prediction Heads
Static-Dynamic 4D Composition
autonomous driving simulation
Haibao Yu
Tuojing Intelligence, The University of Hong Kong
Kuntao Xiao
Tuojing Intelligence
Jiahang Wang
Tuojing Intelligence
Ruiyang Hao
Tuojing Intelligence, King’s College London
Yuxin Huang
Unknown affiliation
Guoran Hu
Tuojing Intelligence, Mohamed bin Zayed University of Artificial Intelligence
Haifang Qin
Tuojing Intelligence
Bowen Jing
Massachusetts Institute of Technology
Yuntian Bo
Tuojing Intelligence
Ping Luo
National University of Defense Technology