Reinforced Refinement with Self-Aware Expansion for End-to-End Autonomous Driving

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
End-to-end autonomous driving faces two key challenges: (1) imitation learning (IL) generalizes poorly to hard cases and lacks a corrective feedback loop after deployment; (2) reinforcement learning (RL) tends to overfit to specific hard scenarios, causing catastrophic forgetting and low sample efficiency. To address these, the authors propose Reinforced Refinement with Self-aware Expansion (R2SE), a three-stage learning pipeline: (i) generalist IL pre-training that also identifies failure-prone cases, (ii) residual RL fine-tuning targeted at those hard cases, and (iii) self-aware adapter expansion that integrates the specialist policies back into the generalist model. This design preserves general driving knowledge while enabling targeted optimization for challenging scenarios. Evaluated in closed-loop simulation and on real-world datasets, the method improves generalization, safety, and long-horizon policy robustness over state-of-the-art end-to-end approaches.

📝 Abstract
End-to-end autonomous driving has emerged as a promising paradigm for directly mapping sensor inputs to planning maneuvers using learning-based modular integration. However, existing imitation learning (IL)-based models generalize poorly to hard cases and lack a corrective feedback loop after deployment. While reinforcement learning (RL) offers a potential way to handle hard cases optimally, it is often hindered by overfitting to specific driving cases, resulting in catastrophic forgetting of generalizable knowledge and sample inefficiency. To overcome these challenges, we propose Reinforced Refinement with Self-aware Expansion (R2SE), a novel learning pipeline that continually refines hard-case domains while retaining a generalizable driving policy for model-agnostic end-to-end driving systems. Through reinforcement fine-tuning and policy expansion that together facilitate continuous improvement, R2SE features three key components: 1) Generalist Pretraining with hard-case allocation trains a generalist IL driving system while dynamically identifying failure-prone cases for targeted refinement; 2) Residual Reinforced Specialist Fine-tuning optimizes residual corrections using RL to improve performance in the hard-case domain while preserving global driving knowledge; 3) Self-aware Adapter Expansion dynamically integrates specialist policies back into the generalist model, enabling continuous performance improvement. Experimental results in closed-loop simulation and on real-world datasets demonstrate improvements in generalization, safety, and long-horizon policy robustness over state-of-the-art E2E systems, highlighting the effectiveness of reinforced refinement for scalable autonomous driving.
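The hard-case allocation step in component 1 can be pictured with a minimal sketch. The abstract does not say how failure-prone cases are identified, so everything here is an assumption for illustration: the `Episode` record, the per-scenario `score` in [0, 1], the fixed `threshold`, and the scenario names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    scenario_id: str
    score: float  # hypothetical closed-loop driving score in [0, 1]; higher is better


def allocate_hard_cases(episodes, threshold=0.5):
    """Split episodes into (general, hard) using an assumed score threshold."""
    general, hard = [], []
    for ep in episodes:
        # Episodes scoring below the threshold are treated as failure-prone.
        (hard if ep.score < threshold else general).append(ep)
    return general, hard


# Made-up evaluation results, for illustration only.
episodes = [
    Episode("merge_01", 0.92),
    Episode("unprotected_left_07", 0.31),
    Episode("cut_in_03", 0.48),
    Episode("straight_12", 0.88),
]
general, hard = allocate_hard_cases(episodes)
hard_ids = [ep.scenario_id for ep in hard]
print(hard_ids)
```

In this toy setup, only the episodes in `hard` would be passed on to the specialist fine-tuning stage, while the generalist keeps handling the rest.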
Problem

Research questions and friction points this paper is trying to address.

Improves generalization in end-to-end autonomous driving models
Addresses overfitting and catastrophic forgetting in reinforcement learning
Enhances continuous performance improvement with self-aware expansion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalist Pretraining with hard-case allocation
Residual Reinforced Specialist Fine-tuning
Self-aware Adapter Expansion for integration
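One way to read the residual fine-tuning and self-aware expansion ideas listed above is as a gated residual on top of a frozen generalist plan. The sketch below is a toy illustration under that assumption, not the authors' implementation: the function names, the `hard_case_score` observation field, the additive waypoint residual, and the hard gate threshold are all hypothetical.

```python
def generalist_plan(observation):
    # Stand-in for the frozen IL generalist: three (x, y) waypoints straight ahead.
    return [(float(t), 0.0) for t in range(1, 4)]


def specialist_residual(observation):
    # Stand-in for an RL-trained residual head: a lateral offset per waypoint.
    return [(0.0, 0.5) for _ in range(3)]


def hard_case_confidence(observation):
    # Hypothetical "self-aware" signal: how likely the scene is a hard case.
    return observation.get("hard_case_score", 0.0)


def plan(observation, gate_threshold=0.5):
    """Return the generalist plan, corrected by the residual only on hard cases."""
    base = generalist_plan(observation)
    if hard_case_confidence(observation) < gate_threshold:
        # Ordinary scene: leave the generalist output untouched.
        return base
    residual = specialist_residual(observation)
    # Hard scene: add the specialist correction waypoint by waypoint.
    return [(bx + rx, by + ry) for (bx, by), (rx, ry) in zip(base, residual)]


easy = plan({"hard_case_score": 0.1})
hard = plan({"hard_case_score": 0.9})
print(easy)
print(hard)
```

Because the correction is added only when the gate fires, behavior on ordinary scenes stays identical to the generalist, which is the intuition behind avoiding catastrophic forgetting in this kind of design.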
Authors

Haochen Liu
School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, 639798
Tianyu Li
OpenDriveLab, the School of Computing and Data Science, The University of Hong Kong, Pokfulam, Hong Kong
Haohan Yang
School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, 639798
Li Chen
OpenDriveLab, the School of Computing and Data Science, The University of Hong Kong, Pokfulam, Hong Kong
Caojun Wang
Shanghai Innovation Institute, Shanghai, 200231
Ke Guo
Nanyang Technological University
robotics, intelligent traffic system, autonomous driving
Haochen Tian
Institute of Automation, Chinese Academy of Sciences
Robotic, Multimodality, Computer Vision
Hongchen Li
Shanghai Innovation Institute, Shanghai, 200231
Hongyang Li
OpenDriveLab, the School of Computing and Data Science, The University of Hong Kong, Pokfulam, Hong Kong
Chen Lv
School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, 639798