🤖 AI Summary
To address the trade-off between safety and real-time performance in collision-free autonomous navigation under partially observable traffic environments, this paper proposes HyPlan, a hybrid learning-and-planning framework. Methodologically, it integrates multi-agent behavior prediction, deep reinforcement learning with proximal policy optimization (PPO), and approximate online POMDP planning with heuristic confidence-based vertical pruning: the learned components improve environmental modeling fidelity and policy generalization, while the planner safeguards decision safety and computational tractability. Evaluated on the CARLA-CTS2 benchmark, the approach navigates more safely than the selected baselines and runs significantly faster than the considered alternative online POMDP planners. To our knowledge, this is the first method to jointly deliver strong safety and real-time responsiveness in partially observable urban driving scenarios.
📝 Abstract
We present HyPlan, a novel hybrid learning-assisted planning method for collision-free navigation of self-driving cars in partially observable traffic environments. HyPlan combines multi-agent behavior prediction, deep reinforcement learning with proximal policy optimization, and approximated online POMDP planning with heuristic confidence-based vertical pruning, which reduces execution time without compromising driving safety. Our experimental performance analysis on the CARLA-CTS2 benchmark of critical traffic scenarios with pedestrians revealed that HyPlan can navigate more safely than the selected relevant baselines and performs significantly faster than the considered alternative online POMDP planners.
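To make the pruning idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how confidence-based vertical pruning can work in a depth-limited online POMDP lookahead: whenever a learned policy (e.g., a PPO actor) is sufficiently confident at a belief node, the subtree below it is cut off and replaced by the learned value estimate, so the planner only searches deeper where the policy is uncertain. The names `policy_confidence`, `plan`, the entropy-based confidence heuristic, and the `policy`/`simulator` interfaces are all illustrative assumptions.

```python
import math

def policy_confidence(action_probs):
    """Heuristic confidence: 1 minus the normalized entropy of the
    action distribution (1.0 = fully certain, 0.0 = uniform)."""
    n = len(action_probs)
    entropy = -sum(p * math.log(p) for p in action_probs if p > 0)
    return 1.0 - entropy / math.log(n)

def plan(belief, depth, max_depth, policy, simulator,
         threshold=0.9, gamma=0.95):
    """Depth-limited lookahead with confidence-based vertical pruning.

    `policy` is assumed to expose action_probs(belief) and value(belief);
    `simulator` is assumed to expose step(belief, action) -> (reward,
    next_belief). Both are hypothetical interfaces for illustration.
    """
    if depth >= max_depth:
        # Horizon reached: bootstrap with the learned value estimate.
        return policy.value(belief)
    probs = policy.action_probs(belief)
    if policy_confidence(probs) >= threshold:
        # Vertical pruning: the learned policy is confident here, so we
        # truncate the search below this node instead of expanding it.
        return policy.value(belief)
    best = -float("inf")
    for action in range(len(probs)):
        reward, next_belief = simulator.step(belief, action)
        best = max(best, reward + gamma * plan(
            next_belief, depth + 1, max_depth,
            policy, simulator, threshold, gamma))
    return best
```

Under this sketch, raising `threshold` makes the planner more conservative (it searches deeper more often), while lowering it shifts more decisions onto the learned policy and reduces planning time.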