🤖 AI Summary
To address the suboptimal tactical decision-making of sampling-based trajectory planners caused by fixed, hand-tuned cost function weights, this paper proposes a hierarchical reinforcement learning framework. In this architecture, a high-level RL agent dynamically modulates the cost function parameters of a low-level sampling-based trajectory planner, enabling adaptive trade-offs between safety and aggressiveness during high-speed autonomous driving. The approach overcomes the limitations of manual parameter tuning, supporting interpretable and real-time policy selection. Evaluated in an autonomous racing simulation, the proposed method achieves a 0% collision rate and reduces overtaking time by up to 60% compared to static planners. It demonstrates substantial improvements in interactive behavior and environmental adaptability, validating its effectiveness for dynamic, high-performance autonomous navigation.
📝 Abstract
Sampling-based trajectory planners are widely used for agile autonomous driving due to their ability to generate fast, smooth, and kinodynamically feasible trajectories. However, their behavior is often governed by a cost function with manually tuned, static weights, which forces a tactical compromise that is suboptimal across the wide range of scenarios encountered in a race. To address this shortcoming, we propose using a Reinforcement Learning (RL) agent as a high-level behavioral selector that dynamically switches the cost function parameters of an analytical, low-level trajectory planner at runtime. We show the effectiveness of our approach in a simulated autonomous racing environment, where our RL-based planner achieved a 0% collision rate while reducing overtaking time by up to 60% compared to state-of-the-art static planners. The agent dynamically switches between aggressive and conservative behaviors, enabling interactive maneuvers unattainable with static configurations. These results demonstrate that integrating reinforcement learning as a high-level selector resolves the inherent trade-off between safety and competitiveness in autonomous racing planners. The proposed methodology offers a pathway toward adaptive yet interpretable motion planning for broader autonomous driving applications.
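The division of labor described above, a high-level selector choosing cost weights and a low-level planner scoring sampled trajectories with them, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Candidate` features, the two weight presets in `PRESETS`, the cost terms, and especially `select_preset`, which is a hand-written rule standing in for the trained RL policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One sampled trajectory, reduced to three illustrative features."""
    name: str
    progress: float      # metres gained along the track (higher is better)
    lateral_jerk: float  # smoothness penalty (lower is better)
    min_gap: float       # closest distance to the opponent in metres


# Hypothetical weight presets; the paper's actual parameter sets are not given here.
PRESETS = {
    "aggressive": {"progress": 1.0, "jerk": 0.1, "proximity": 0.5},
    "conservative": {"progress": 0.3, "jerk": 0.5, "proximity": 5.0},
}


def cost(c: Candidate, w: dict) -> float:
    """Weighted sum: reward progress, penalise jerk and closeness to the opponent."""
    return (
        -w["progress"] * c.progress
        + w["jerk"] * c.lateral_jerk
        + w["proximity"] / max(c.min_gap, 0.1)
    )


def plan(candidates: list, preset: str) -> Candidate:
    """Low-level planner: pick the minimum-cost candidate under the given preset."""
    weights = PRESETS[preset]
    return min(candidates, key=lambda c: cost(c, weights))


def select_preset(gap_to_opponent: float, on_straight: bool) -> str:
    """Stand-in for the RL agent: a fixed rule, NOT a learned policy."""
    if on_straight and gap_to_opponent < 15.0:
        return "aggressive"
    return "conservative"


candidates = [
    Candidate("follow", progress=8.0, lateral_jerk=0.2, min_gap=6.0),
    Candidate("overtake", progress=12.0, lateral_jerk=1.5, min_gap=1.0),
]

# The same candidate set yields different choices depending on the selected preset.
for situation in [(10.0, True), (10.0, False)]:
    preset = select_preset(*situation)
    print(preset, "->", plan(candidates, preset).name)
```

Because the selector only picks among named weight sets and the planner itself stays analytical, each decision can be traced back to a specific preset, which is one way to read the interpretability claim in the abstract.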