ADReFT: Adaptive Decision Repair for Safe Autonomous Driving via Reinforcement Fine-Tuning

📅 2025-06-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Autonomous driving systems (ADSs) face persistent safety-critical risks due to inherent limitations in their design and performance. Existing online repair methods generalize poorly and adopt overly conservative policies, so they fail to balance safety assurance with driving comfort. This paper proposes ADReFT, an adaptive decision-repair framework based on reinforcement learning (RL) fine-tuning. First, a Transformer-based dual-head architecture, comprising a state monitoring head (State Monitor) and a decision adaptation head (Decision Adapter), is pretrained via supervised learning on coarsely labeled data to establish a foundational repair capability. RL fine-tuning then refines that capability, enabling more accurate identification of high-risk states and context-aware generation of mitigation actions. Experimental results reportedly show that the method significantly reduces safety violation rates while preserving natural, smooth driving behavior, improving both operational safety and overall ADS performance.

📝 Abstract

Autonomous Driving Systems (ADSs) continue to face safety-critical risks due to the inherent limitations in their design and performance capabilities. Online repair plays a crucial role in mitigating such limitations, ensuring the runtime safety and reliability of ADSs. Existing online repair solutions enforce ADS compliance by transforming unacceptable trajectories into acceptable ones based on predefined specifications, such as rule-based constraints or training datasets. However, these approaches often lack generalizability and adaptability and tend to be overly conservative, resulting in ineffective repairs that not only fail to mitigate safety risks sufficiently but also degrade the overall driving experience. To address this issue, we propose Adaptive Decision Repair (ADReFT), a novel and effective repair method that identifies safety-critical states through offline learning from failed tests and generates appropriate mitigation actions to improve ADS safety. Specifically, ADReFT incorporates a transformer-based model with two joint heads, State Monitor and Decision Adapter, designed to capture complex driving environment interactions in order to evaluate state safety severity and generate adaptive repair actions. Given the absence of oracles for state safety identification, we first pretrain ADReFT using supervised learning with coarse annotations, i.e., labeling states preceding violations as positive samples and all other states as negative samples. This pretraining establishes ADReFT's foundational capability to mitigate safety-critical violations, though it may result in somewhat conservative mitigation strategies. Therefore, we subsequently fine-tune ADReFT using reinforcement learning to improve on this initial capability and generate more precise and contextually appropriate repair decisions. Our evaluation results illustrate that ADReFT achieves better repair performance.
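
The coarse annotation scheme described in the abstract (states preceding a violation are positive, all others negative) can be illustrated with a minimal sketch. The abstract does not say how many preceding states are labeled, so the `window` parameter, the function name `coarse_labels`, and the convention that a failed trace ends at the violation are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List


def coarse_labels(num_states: int, window: int, ended_in_violation: bool) -> List[int]:
    """Return one 0/1 label per recorded state of a single test trace.

    Assumption: the trace holds the states recorded before the test ended,
    and a failed trace ends at the violation. The last `window` states of a
    failed trace are labeled positive (1); everything else is negative (0).
    `window` is a hypothetical knob, not a value reported by the paper.
    """
    if num_states < 0 or window < 0:
        raise ValueError("num_states and window must be non-negative")
    if not ended_in_violation:
        # Passing traces contribute only negative samples.
        return [0] * num_states
    first_positive = max(0, num_states - window)
    return [0] * first_positive + [1] * (num_states - first_positive)


# A failed trace of six states with a two-state window before the violation.
print(coarse_labels(6, 2, True))
```

Labels produced this way are deliberately coarse: a state close to a violation is not necessarily unsafe, which is consistent with the abstract's remark that the pretrained model may be somewhat conservative before reinforcement fine-tuning.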

Problem

Research questions and friction points this paper is trying to address.

Improving safety in Autonomous Driving Systems (ADSs)

Addressing lack of generalizability in online repair solutions

Reducing conservatism in safety-critical state mitigation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based model with dual heads

Offline learning from failed tests

Reinforcement fine-tuning for precise repairs
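
How the two heads might cooperate at runtime can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: in ADReFT both heads sit on a shared transformer, whereas here `monitor` and `adapter` are plain callables, and the threshold gating, the `repair_step` function, and the toy monitor and adapter are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]  # hypothetical state encoding, e.g. (gap_m, speed_mps)


@dataclass
class RepairResult:
    action: float    # action actually sent to the vehicle
    repaired: bool   # True if the original ADS decision was overridden
    severity: float  # safety severity score from the monitor


def repair_step(
    state: State,
    ads_action: float,
    monitor: Callable[[State], float],
    adapter: Callable[[State, float], float],
    threshold: float = 0.5,  # assumed gating threshold, not from the paper
) -> RepairResult:
    """Keep the ADS decision unless the monitor flags the state as critical."""
    severity = monitor(state)
    if severity >= threshold:
        return RepairResult(adapter(state, ads_action), True, severity)
    return RepairResult(ads_action, False, severity)


# Toy stand-ins for the two heads (purely illustrative).
def toy_monitor(state: State) -> float:
    gap_m = state[0]
    return 1.0 if gap_m < 10.0 else 0.0


def toy_adapter(state: State, ads_action: float) -> float:
    # Replace the planned acceleration with braking at -2.0 or harder.
    return min(ads_action, -2.0)


print(repair_step((5.0, 10.0), 0.5, toy_monitor, toy_adapter))
print(repair_step((50.0, 10.0), 0.5, toy_monitor, toy_adapter))
```

Gating the repair on a severity score is one way to leave the original decision untouched in states judged safe, which is in the spirit of the stated goal of reducing overly conservative interventions.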

🔎 Similar Papers

Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning