🤖 AI Summary
To address the challenge of transferring autonomous racing control from simulation to real-world deployment under high-speed, strongly nonlinear, and strict real-time constraints, this paper proposes the first onboard online reinforcement learning (RL) framework that requires no prior simulation pretraining. Methodologically, we introduce a novel residual RL architecture that combines Soft Actor-Critic with a classical controller; an asynchronous training pipeline, coupled with a Heuristic Delayed Reward Adjustment (HDRA) mechanism and multi-step temporal-difference learning, significantly improves sample efficiency and training stability. Evaluated on the F1TENTH physical platform, our system achieves an 11.5% lap-time reduction after only 20 minutes of online training on a real racetrack—setting a new state-of-the-art for end-to-end autonomous racing control—and enables continuous adaptive optimization in dynamic environments.
📝 Abstract
Autonomous racing presents unique challenges due to its non-linear dynamics, the high speeds involved, and the critical need for real-time decision-making under dynamic and unpredictable conditions. Most traditional Reinforcement Learning (RL) approaches rely on extensive simulation-based pre-training, which faces crucial challenges in transferring effectively to real-world environments. This paper introduces a robust on-board RL framework for autonomous racing, designed to eliminate the dependency on simulation-based pre-training and enable direct real-world adaptation. The proposed system introduces a refined Soft Actor-Critic (SAC) algorithm, leveraging a residual RL structure to enhance classical controllers in real time, and integrates multi-step Temporal-Difference (TD) learning, an asynchronous training pipeline, and Heuristic Delayed Reward Adjustment (HDRA) to improve sample efficiency and training stability. The framework is validated through extensive experiments on the F1TENTH racing platform, where the residual RL controller consistently outperforms the baseline controllers and achieves up to an 11.5% reduction in lap times compared to the State-of-the-Art (SotA) with only 20 minutes of training. Additionally, an End-to-End (E2E) RL controller trained without a baseline controller surpasses the previous best results with sustained on-track learning. These findings position the framework as a robust solution for high-performance autonomous racing and a promising direction for other real-time, dynamic autonomous systems.
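To make the two core ideas in the abstract concrete, here is a minimal sketch of (a) the residual RL action composition, where the learned policy outputs a correction added to a classical controller's command, and (b) a multi-step TD target. The function names, bounds, and discount factor are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def residual_action(baseline_action, residual, low=-1.0, high=1.0):
    # Residual RL: the learned policy proposes a correction that is
    # added to the classical controller's action, then clipped to the
    # actuator bounds. (Bounds here are assumed, not from the paper.)
    return np.clip(np.asarray(baseline_action) + np.asarray(residual), low, high)

def n_step_td_target(rewards, bootstrap_value, gamma=0.99):
    # Multi-step TD target: discounted sum of n observed rewards plus
    # the discounted critic bootstrap estimate at step n.
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target

# Example: baseline steering 0.9 plus learned correction 0.3 is
# clipped back into the [-1, 1] actuation range.
print(residual_action([0.9], [0.3]))           # -> [1.]
# 3-step target with gamma=0.5 and a zero bootstrap:
# 1 + 0.5*2 + 0.25*4 = 3.0
print(n_step_td_target([1.0, 2.0, 4.0], 0.0, gamma=0.5))
```

The clipping step matters in practice: it keeps the combined command within actuator limits even when the residual policy is still poorly trained, which is part of why residual structures are sample-efficient and safe enough for on-robot learning.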