AI Summary
High-fidelity 3D reconstruction of street scenes containing fast-moving objects remains challenging in autonomous driving: existing methods rely either on labor-intensive pose annotations or on 3D trackers that generalize poorly, compromising robustness.
Method: We propose an end-to-end dynamic scene reconstruction framework that eliminates the need for explicit 3D tracking. Our approach introduces a novel stable tracking module that integrates 2D depth-aware object tracking with 3D Gaussian splatting (Street Gaussians), coupled with self-correcting motion learning in an implicit feature space that rectifies trajectory errors and recovers missed detections; a sketch of the track-lifting idea follows this summary. Additionally, multi-frame feature fusion enhances temporal consistency.
Results: Evaluated on Waymo-NOTR and KITTI, our method achieves significant improvements in dynamic object reconstruction fidelity and cross-scene generalization, outperforming current state-of-the-art approaches.
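The tracking module is described only at a high level above. The following is a minimal illustrative sketch, not the paper's implementation, of how associations from a 2D tracker plus per-frame depth could be lifted into 3D object trajectories; all names here (`lift_box_center`, `lift_tracks`, the `(boxes, depth_map)` input format, the intrinsics matrix `K`) are assumptions for illustration.

```python
# Hypothetical sketch: lift 2D tracker boxes into 3D using per-frame depth,
# inheriting object identities from the 2D tracker instead of a 3D tracker.
import numpy as np

def lift_box_center(box, depth_map, K):
    """Back-project the center of a 2D box (x1, y1, x2, y2) into camera space."""
    u = 0.5 * (box[0] + box[2])
    v = 0.5 * (box[1] + box[3])
    z = depth_map[int(v), int(u)]        # metric depth sampled at the box center
    x = (u - K[0, 2]) * z / K[0, 0]      # pinhole back-projection
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.array([x, y, z])

def lift_tracks(frames, K):
    """frames: list of (boxes, depth_map) pairs, where boxes maps
    track_id -> 2D box. Returns {track_id: [3D centers over time]},
    keyed by the 2D tracker's own identities, so cross-frame association
    comes for free from the 2D foundation model."""
    trajectories = {}
    for boxes, depth_map in frames:
        for tid, box in boxes.items():
            trajectories.setdefault(tid, []).append(
                lift_box_center(box, depth_map, K))
    return trajectories
```

In a pipeline like the one described, such lifted centers could serve as the (noisy) initialization for per-object Gaussians and their poses, with no learned 3D association step required.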
Abstract
Realistic scene reconstruction in driving scenarios poses significant challenges due to fast-moving objects. Most existing methods rely on labor-intensive manual labeling of object poses to reconstruct dynamic objects in canonical space and move them based on these poses during rendering. While some approaches attempt to use 3D object trackers to replace manual annotations, the limited generalization of 3D trackers -- caused by the scarcity of large-scale 3D datasets -- results in inferior reconstructions in real-world settings. In contrast, 2D foundation models demonstrate strong generalization capabilities. To eliminate the reliance on 3D trackers and enhance robustness across diverse environments, we propose a stable object tracking module by leveraging associations from 2D deep trackers within a 3D object fusion strategy. We address inevitable tracking errors by further introducing a motion learning strategy in an implicit feature space that autonomously corrects trajectory errors and recovers missed detections. Experimental results on Waymo-NOTR and KITTI show that our method outperforms existing approaches. Our code will be made publicly available.
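The "motion learning strategy in an implicit feature space" is not specified in detail here. As a hedged sketch of one plausible realization (PyTorch; all names and the exact parameterization are hypothetical, not the paper's API), each object's lifted trajectory could be wrapped in learnable residuals decoded from per-timestep latent features, so that gradients from the rendering loss can correct tracking errors:

```python
# Hypothetical sketch of self-correcting motion learning: a noisy lifted
# trajectory plus a learned residual decoded from implicit motion features.
import torch
import torch.nn as nn

class SelfCorrectingTrajectory(nn.Module):
    def __init__(self, init_centers, feat_dim=32):
        super().__init__()
        # Initial (possibly erroneous) 3D centers from the 2D-lifted tracker.
        self.register_buffer("init_centers", init_centers)   # shape (T, 3)
        T = init_centers.shape[0]
        # Implicit per-timestep motion features, decoded into pose residuals.
        self.motion_feat = nn.Parameter(torch.zeros(T, feat_dim))
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self):
        # Corrected trajectory = noisy initialization + learned correction.
        return self.init_centers + self.decoder(self.motion_feat)
```

Under this reading, the residual features would be optimized jointly with the scene's Gaussian parameters, letting photometric supervision repair the drift and missed detections present in the 2D-lifted initialization.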