🤖 AI Summary
To address low trajectory feasibility and inefficiency in sampling-based motion planning for complex urban driving scenarios—caused by uniform or heuristic sampling—this paper proposes a reinforcement learning (RL)-guided hybrid sampling framework. Methodologically, it integrates an RL agent with analytical trajectory verification to enable learnable yet interpretable guidance of the sampling regions; further, it introduces a decodable deep set encoder to construct a compact world model that handles a varying number of traffic participants. The approach jointly leverages deterministic feasibility checking, cost optimization, and world-model prediction to balance computational efficiency with formal verifiability. Evaluated on the CommonRoad benchmark, the method achieves a 99% reduction in sampled states and an 84% decrease in runtime compared to baselines, while maintaining high task success rates and zero collisions, significantly enhancing real-time performance and safety for autonomous urban driving.
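The hybrid loop described above can be sketched roughly as follows: a learned policy proposes a sampling region, candidate trajectories are drawn from it, and a deterministic feasibility check plus an analytical cost function make the final selection. This is a minimal illustrative sketch, not the paper's implementation; all names (`propose_region`, `is_feasible`, `cost`) and the toy constant-offset trajectories are assumptions for clarity.

```python
import random

def propose_region(state):
    # Stand-in for the RL policy: returns a (center, width) interval of
    # lateral offsets to sample from. A trained agent would narrow this
    # to regions likely to yield feasible trajectories.
    return 0.0, 1.0

def is_feasible(traj):
    # Deterministic feasibility check (kinematics, collisions, ...);
    # here reduced to a toy corridor constraint.
    return all(abs(p) <= 2.0 for p in traj)

def cost(traj):
    # Analytical cost: prefer trajectories close to the centerline.
    return sum(p * p for p in traj)

def plan(state, n_samples=16, horizon=5, seed=0):
    # RL-guided sampling; final selection stays deterministic and verifiable.
    rng = random.Random(seed)
    center, width = propose_region(state)
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        offset = rng.uniform(center - width, center + width)
        traj = [offset] * horizon  # toy constant-offset trajectory
        if is_feasible(traj) and cost(traj) < best_cost:
            best, best_cost = traj, cost(traj)
    return best
```

The key property mirrored here is the division of labor: the learned component only shapes *where* sampling happens, while feasibility and cost remain fully analytical.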
📝 Abstract
Sampling-based motion planning is a well-established approach in autonomous driving, valued for its modularity and analytical tractability. In complex urban scenarios, however, uniform or heuristic sampling often produces many infeasible or irrelevant trajectories. We address this limitation with a hybrid framework that learns where to sample while keeping trajectory generation and evaluation fully analytical and verifiable. A reinforcement learning (RL) agent guides the sampling process toward regions of the action space likely to yield feasible trajectories, while evaluation and final selection remain governed by deterministic feasibility checks and cost functions. We couple the RL sampler with a world model (WM) based on a decodable deep set encoder, enabling it to handle a variable number of traffic participants while keeping latent representations reconstructable. The approach is evaluated in the CommonRoad simulation environment, showing up to 99% fewer required samples and a runtime reduction of up to 84% while maintaining planning quality in terms of success and collision-free rates. These improvements lead to faster, more reliable decision-making for autonomous vehicles in urban environments, achieving safer and more responsive navigation under real-world constraints. Code and trained artifacts are publicly available at: https://github.com/TUM-AVS/Learning-to-Sample
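The deep set encoder mentioned in the abstract can be illustrated with a minimal sketch: each traffic participant is embedded by a shared per-element function and the embeddings are sum-pooled, giving a fixed-size latent that is invariant to participant ordering and defined for any participant count. The feature layout (`[x, y, vx, vy]`), the toy linear embedding `phi`, and the latent size are assumptions for illustration; the paper's actual encoder is learned and decodable back to the scene.

```python
def phi(obstacle):
    # Shared per-participant embedding; here a fixed toy linear map.
    x, y, vx, vy = obstacle
    return [x + vx, y + vy, x - y]

def encode(obstacles):
    # Sum-pool the per-element embeddings: the result has fixed size,
    # is invariant to participant ordering, and is defined even for an
    # empty scene.
    latent = [0.0, 0.0, 0.0]
    for ob in obstacles:
        latent = [a + b for a, b in zip(latent, phi(ob))]
    return latent

scene = [[1.0, 2.0, 0.5, 0.0], [3.0, 1.0, 0.0, 0.2]]
# Reordering the participants leaves the latent unchanged.
assert encode(scene) == encode(list(reversed(scene)))
```

In the learned version, `phi` and a downstream pooling network are trained jointly with a decoder, so the compact latent remains reconstructable rather than a one-way summary.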