🤖 AI Summary
This work addresses training instability and unsafe action outputs in reinforcement learning for high-speed autonomous racing, a domain characterized by high dynamics and strong nonlinearities. The authors propose TraD-RL, a method that integrates expert-trajectory-guided state representation and reward shaping to explicitly embed vehicle dynamics priors, while employing control barrier functions to construct a safety envelope of admissible actions. Coupled with a multi-stage curriculum learning strategy, the policy transitions smoothly from expert-guided initialization to autonomous exploration. Evaluated in a high-fidelity simulation of the Tempelhof Airport Street Circuit, the approach significantly improves lap time and driving stability, surpassing expert-level performance while jointly optimizing racing efficiency and safety.
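The summary's "expert trajectory-guided ... reward shaping" can be illustrated with a minimal sketch. Everything here (function name, waypoint representation, weights `w_prog`/`w_dev`, using speed as a progress proxy) is an assumption for illustration, not the paper's actual reward:

```python
import numpy as np

def shaped_reward(pos, speed, expert_line, w_prog=1.0, w_dev=0.5):
    """Hypothetical trajectory-guided shaped reward (weights and form assumed,
    not taken from the paper): reward progress along a precomputed expert
    racing line and penalize lateral deviation from it.

    pos:         (x, y) position of the vehicle
    speed:       scalar speed, used here as a crude progress proxy
    expert_line: (N, 2) array of waypoints along the expert racing line
    """
    dists = np.linalg.norm(expert_line - np.asarray(pos, dtype=float), axis=1)
    deviation = float(dists.min())     # distance to nearest expert waypoint
    return w_prog * speed - w_dev * deviation
```

A real implementation would measure progress as arc length advanced along the track centerline per step rather than raw speed, but the trade-off it encodes (go fast, stay near the expert line) is the same.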
📝 Abstract
Reinforcement learning has demonstrated significant potential in autonomous driving. However, it suffers from training instability and unsafe action outputs in autonomous racing environments characterized by high dynamics and strong nonlinearities. To this end, this paper proposes a trajectory-guided and dynamics-constrained reinforcement learning (TraD-RL) method for autonomous racing. The key features of this method are as follows: 1) leveraging a prior expert racing line to construct an augmented state representation and facilitate reward shaping, thereby integrating domain knowledge to stabilize early-stage policy learning; 2) embedding explicit vehicle dynamics priors into a safe operating envelope formulated via control barrier functions to enable safety-constrained learning; and 3) adopting a multi-stage curriculum learning strategy that shifts from expert-guided learning to autonomous exploration, allowing the learned policy to surpass expert-level performance. The proposed method is evaluated in a high-fidelity simulation environment modeled after the Tempelhof Airport Street Circuit. Experimental results demonstrate that TraD-RL improves both lap speed and driving stability, achieving a synergistic optimization of racing performance and safety.
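The "safe operating envelope formulated via control barrier functions" (feature 2) can be sketched as a discrete-time CBF safety filter on the policy's action. This is a generic single-axis illustration, not the paper's formulation: the state model, barrier choice, and all constants (`d_max`, `gamma`, `a_lim`) are assumptions:

```python
import numpy as np

def cbf_filter(d, v, a_rl, dt=0.05, d_max=3.0, gamma=0.2, a_lim=4.0):
    """Minimal discrete-time CBF safety filter (illustrative sketch).

    d:    distance already travelled toward the track edge [m]
    v:    rate of approach toward that edge [m/s]
    a_rl: nominal acceleration toward the edge proposed by the RL policy

    Barrier: h(x) = d_max - d  (h >= 0 means inside the safe envelope).
    Double-integrator step: d' = d + v*dt + 0.5*a*dt**2.
    Enforcing h(x') >= (1 - gamma) * h(x) lets h shrink by at most a
    factor gamma per step, so h stays nonnegative for all time.
    """
    h = d_max - d
    # Largest edge-ward acceleration still satisfying the CBF condition:
    a_cbf = (gamma * h - v * dt) / (0.5 * dt**2)
    # Pass the RL action through unchanged unless it violates the barrier,
    # then clip the result to actuator limits.
    return float(np.clip(min(a_rl, a_cbf), -a_lim, a_lim))
```

A full method would instead solve a small QP minimizing the deviation from the RL action subject to CBF constraints over the actual vehicle dynamics; the key idea is the same: the learned policy proposes, and the barrier condition disposes.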