Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization

📅 2026-03-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses two obstacles to deploying traditional residual policy learning for autonomous racing: system complexity and inference latency. The authors propose attenuated residual policy optimization (α-RPO), a method that progressively diminishes the influence of a base policy during training, ultimately yielding a standalone neural network policy. By treating the base policy solely as a training guide rather than a runtime dependency, α-RPO simplifies the deployment architecture, naturally supports multimodal privileged learning, and integrates seamlessly into the Proximal Policy Optimization (PPO) framework. Evaluated on a 1:10-scale autonomous racing platform, α-RPO outperforms baseline methods in both simulation and real-world environments, achieving efficient zero-shot sim-to-real transfer while reducing system complexity and improving driving performance.
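The core mechanism can be sketched as an additive residual whose base-policy weight α decays over training. The linear schedule and additive blending below are illustrative assumptions, not the paper's exact formulation:

```python
def attenuation(step, total_steps):
    """Attenuation coefficient alpha in [0, 1]: the base policy's weight,
    decayed linearly from 1 to 0 over training. The linear schedule is an
    assumption for illustration; the paper's schedule may differ."""
    return max(0.0, 1.0 - step / total_steps)

def combined_action(base_action, residual_action, alpha):
    """Blend the static base controller's action with the learned residual.
    At alpha = 1 the base policy dominates; at alpha = 0 only the neural
    policy acts, so deployment needs no base controller (or its sensors)."""
    return [alpha * b + r for b, r in zip(base_action, residual_action)]

# Mid-training example: alpha = 0.5 with steering/throttle-like actions.
alpha = attenuation(step=5000, total_steps=10000)
action = combined_action([1.0, 0.0], [0.2, 0.1], alpha)
```

Because α reaches zero by the end of training, the residual network alone produces the final actions, which is what makes the standalone deployment possible.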

📝 Abstract
Residual policy learning (RPL), in which a learned policy refines a static base policy using deep reinforcement learning (DRL), has shown strong performance across various robotic applications. Its effectiveness is particularly evident in autonomous racing, a domain that serves as a challenging benchmark for real-world DRL. However, deploying RPL-based controllers introduces system complexity and increases inference latency. We address this by introducing an extension of RPL named attenuated residual policy optimization (α-RPO). Unlike standard RPL, α-RPO yields a standalone neural policy by progressively attenuating the base policy, which initially serves to bootstrap learning. Furthermore, this mechanism enables a form of privileged learning, where the base policy is permitted to use sensor modalities not required for final deployment. We design α-RPO to integrate seamlessly with PPO, ensuring that the attenuated influence of the base controller is dynamically compensated during policy optimization. We evaluate α-RPO by building a framework for 1:10-scale autonomous racing around it. In both simulation and zero-shot real-world transfer to Roboracer cars, α-RPO not only reduces system complexity but also improves driving performance compared to baselines, demonstrating its practicality for robotic deployment. Our code is available at: https://github.com/raphajaner/arpo_racing.
Problem

Research questions and friction points this paper addresses.

residual policy learning
autonomous racing
inference latency
system complexity
real-world deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

attenuated residual policy optimization
residual policy learning
autonomous racing
privileged learning
zero-shot transfer