🤖 AI Summary
This work addresses a limitation of existing end-to-end automatic parking methods: they often neglect explicit modeling of vehicle motion states, which results in physically implausible trajectories and non-human-like behavior, particularly at gear-shift points during multi-stage parking maneuvers. To overcome this, we propose a dual-branch end-to-end architecture that jointly predicts continuous spatial trajectories and discrete motion state sequences (e.g., forward/reverse), thereby introducing explicit motion state modeling into end-to-end parking for the first time. Additionally, we incorporate a Fourier feature-based representation of parking slots, which avoids the resolution constraints of conventional bird’s-eye-view (BEV) representations and improves the accuracy of target interaction. Experiments on the CARLA simulation platform show that our approach generates more robust and human-like trajectories in complex multi-stage parking scenarios and significantly improves gear-shift point localization accuracy. We also release a dedicated parking dataset to support future research.
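To make "gear-shift point localization" concrete, the sketch below shows one way a gear-shift point could be read off a discrete motion state sequence and compared against a reference. It is an illustration only: the state labels `"F"`/`"R"`, the function names, and the nearest-match error measure are all assumptions made here, not the metric or the code used in the paper.

```python
def gear_shift_indices(states):
    """Return every index i where the motion state differs from the previous step."""
    return [i for i in range(1, len(states)) if states[i] != states[i - 1]]


def shift_localization_error(pred_states, true_states):
    """Hypothetical metric: mean index offset from each reference shift
    point to the nearest predicted shift point."""
    pred = gear_shift_indices(pred_states)
    true = gear_shift_indices(true_states)
    if not true:
        return 0.0           # nothing to localize
    if not pred:
        return float("inf")  # reference has shifts, prediction has none
    return sum(min(abs(t - p) for p in pred) for t in true) / len(true)


# Assumed labels: "F" = forward, "R" = reverse, one label per trajectory waypoint.
true_states = ["F", "F", "F", "R", "R", "R", "F", "F"]
pred_states = ["F", "F", "F", "F", "R", "R", "F", "F"]

print(gear_shift_indices(true_states))                     # [3, 6]
print(gear_shift_indices(pred_states))                     # [4, 6]
print(shift_localization_error(pred_states, true_states))  # 0.5
```

In this toy case the first predicted shift is one waypoint late and the second is exact, so the average offset is half a waypoint.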
📝 Abstract
Autonomous parking fundamentally differs from on-road driving due to its frequent direction changes and complex maneuvering requirements. However, existing End-to-End (E2E) planning methods often simplify the parking task into a geometric path regression problem, neglecting explicit modeling of the vehicle's kinematic state. This "dimensionality deficiency" easily leads to physically infeasible trajectories and to deviations from real human driving behavior, particularly at critical gear-shift points in multi-shot parking scenarios. In this paper, we propose SunnyParking, a novel dual-branch E2E architecture that achieves motion state awareness by jointly predicting spatial trajectories and discrete motion state sequences (e.g., forward/reverse). Additionally, we introduce a Fourier feature-based representation of target parking slots to overcome the resolution limitations of traditional bird's-eye view (BEV) approaches, enabling high-precision target interactions. Experimental results demonstrate that our framework generates more robust and human-like trajectories in complex multi-shot parking scenarios, while significantly improving gear-shift point localization accuracy compared to state-of-the-art methods. We open-source a new parking dataset built in the CARLA simulator, specifically designed to evaluate full prediction capabilities under complex maneuvers.
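As a rough illustration of what a Fourier feature representation of a parking slot can look like, the sketch below applies the standard sinusoidal mapping to continuous slot coordinates, so the target is not snapped to a BEV grid cell. The abstract does not specify how SunnyParking parameterizes slots or chooses frequencies; the four-corner slot layout, the `scale` normalization, the power-of-two frequency bands, and the function name are assumptions made for this example.

```python
import math


def fourier_features(coords, num_bands=4):
    """Map each scalar c to [sin(2^k * pi * c), cos(2^k * pi * c)] for k = 0..num_bands-1."""
    feats = []
    for c in coords:
        for k in range(num_bands):
            angle = (2.0 ** k) * math.pi * c
            feats.append(math.sin(angle))
            feats.append(math.cos(angle))
    return feats


# Hypothetical target slot: four corner points (x, y) in metres, in the ego frame.
slot_corners = [(2.0, 5.0), (4.5, 5.0), (4.5, 10.0), (2.0, 10.0)]
scale = 20.0  # assumed normalization range in metres

# Flatten to 8 normalized scalars, then encode.
flat = [v / scale for corner in slot_corners for v in corner]
embedding = fourier_features(flat, num_bands=4)

print(len(embedding))  # 64 = 8 coordinates * 4 bands * 2 (sin, cos)
```

Because the encoding operates on real-valued coordinates, its precision is bounded by the chosen frequency bands instead of by the cell size of a rasterized BEV map, which is the general motivation the abstract gives for the approach.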