🤖 AI Summary
End-to-end autonomous parking is not yet robust to domain shifts such as changes in weather and illumination. This paper proposes Dino-Diffusion Parking, a domain-agnostic parking framework that addresses this gap. The method combines the DINO vision foundation model, used to extract domain-invariant features for cross-domain perception, with a diffusion model that generates robust trajectory plans. The pipeline is trained in the CARLA simulator, and sim-to-real transfer is evaluated in a 3D Gaussian Splatting environment reconstructed from a real parking lot. In zero-shot cross-domain tests, the parking success rate stays above 90% in all tested out-of-distribution scenarios, including adverse weather and lighting conditions, and the results in the reconstructed real-world environment indicate promising sim-to-real transfer.
📝 Abstract
Parking is a critical pillar of driving safety. While recent end-to-end (E2E) approaches have achieved promising in-domain results, robustness under domain shifts (e.g., weather and lighting changes) remains a key challenge. Rather than relying on additional data, we propose Dino-Diffusion Parking (DDP), a domain-agnostic autonomous parking pipeline that integrates visual foundation models with diffusion-based planning to enable generalized perception and robust motion planning under distribution shifts. We train our pipeline in CARLA under a regular setting and transfer it to more adversarial settings in a zero-shot fashion. Our model consistently achieves a parking success rate above 90% across all tested out-of-distribution (OOD) scenarios, and ablation studies confirm that both the network architecture and the algorithmic design significantly improve cross-domain performance over existing baselines. Furthermore, testing in a 3D Gaussian splatting (3DGS) environment reconstructed from a real-world parking lot demonstrates promising sim-to-real transfer.
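
The abstract does not describe the planner's internals, so the following is only a minimal sketch of what diffusion-based trajectory sampling looks like in general. It uses the standard DDPM ancestral sampling update, not the DDP implementation. The function names, the linear noise schedule, the 8-waypoint horizon, and the 2D goal used as conditioning are all assumptions made for illustration. In a real system the noise predictor would be a trained network conditioned on visual features. Here it is a hypothetical stand-in that assumes the clean trajectory is a straight line from the origin to the goal.

```python
import math
import random

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative products of alpha = 1 - beta (assumed values)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def toy_noise_predictor(x_t, t, goal, alpha_bars):
    """Hypothetical stand-in for a learned, feature-conditioned noise predictor.

    It assumes the clean trajectory is a straight line from the origin to
    `goal` and returns the noise that would explain x_t under that assumption.
    """
    horizon = len(x_t)
    scale = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    eps = []
    for k, point in enumerate(x_t):
        frac = k / (horizon - 1)  # position along the straight line, 0..1
        eps.append([(p - scale * frac * g) / sigma for p, g in zip(point, goal)])
    return eps

def sample_trajectory(goal, horizon=8, num_steps=50, seed=0):
    """DDPM ancestral sampling: start from Gaussian noise and denoise step by step."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [[rng.gauss(0.0, 1.0) for _ in goal] for _ in range(horizon)]
    for t in reversed(range(num_steps)):
        eps_hat = toy_noise_predictor(x, t, goal, alpha_bars)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv = 1.0 / math.sqrt(1.0 - betas[t])
        # No noise is injected on the final step (t == 0).
        noise_scale = math.sqrt(betas[t]) if t > 0 else 0.0
        x = [[inv * (p - coef * e) + noise_scale * rng.gauss(0.0, 1.0)
              for p, e in zip(point, eps)]
             for point, eps in zip(x, eps_hat)]
    return x

goal = [4.0, -2.0]  # illustrative target pose (x, y) in metres, not from the paper
trajectory = sample_trajectory(goal)
```

Because the stand-in predictor is exact for its assumed straight-line target, the sampled waypoints end on that line regardless of the seed. A learned predictor would instead produce a distribution over feasible parking trajectories.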