🤖 AI Summary
In autonomous driving, perception uncertainty caused by dynamic and static occlusions undermines the reliability of decision-making. Existing occlusion-aware decision-making methods suffer from high computational complexity, limited scalability across scenarios, or reliance on limited expert data, and earlier occlusion-aware reinforcement learning (RL) approaches additionally learn inefficiently and lack predictive ability. This paper proposes Pad-AI, a framework grounded in active perception that uses a vectorized representation of the occluded environment and learns over semantic motion primitives, so that exploration focuses on high-level active perception. Pad-AI also integrates prediction and RL within a unified framework to provide risk-aware learning and safety guarantees. In closed-loop evaluations on challenging scenarios with both dynamic and static occlusions, the authors report more efficient and more general perception-aware exploration than other strong baselines.
📝 Abstract
Occlusion-aware decision-making is essential in autonomous driving because occlusions introduce high uncertainty. Recent occlusion-aware decision-making methods suffer from high computational complexity, limited scalability across scenarios, or reliance on limited expert data. Because reinforcement learning (RL) generates its own training data automatically through randomized exploration, we find that it shows promise for occlusion-aware decision-making. However, previous occlusion-aware RL methods struggle to extend to diverse dynamic and static occlusion scenarios, learn inefficiently, and lack predictive ability. To address these issues, we introduce Pad-AI, a self-reinforcing framework that learns occlusion-aware decision-making through active perception. Pad-AI uses a vectorized representation to model occluded environments efficiently and learns over semantic motion primitives to focus exploration on high-level active perception. Furthermore, Pad-AI integrates prediction and RL within a unified framework to provide risk-aware learning and safety guarantees. We tested our framework in challenging scenarios with both dynamic and static occlusions; in closed-loop evaluations it demonstrated more efficient and more general perception-aware exploration than other strong baselines.