ROAD: Responsibility-Oriented Reward Design for Reinforcement Learning in Autonomous Driving

📅 2025-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenges of human-dependent reward design, poor generalizability, and regulatory non-compliance in reinforcement learning for autonomous driving, this paper proposes a responsibility-oriented reward modeling paradigm. The method pioneers the deep integration of traffic regulation knowledge graphs with vision-language models (VLMs), augmented by retrieval-augmented generation (RAG), to attribute liability in accident scenarios at a fine-grained level and thereby automatically generate dynamic, regulation-compliant reward signals. Embedded within a deep reinforcement learning framework, the approach improves liability attribution accuracy by 23.6%, reduces the agent's attributable responsibility in complex traffic incidents by 41.2%, and enhances decision-making compliance and safety. The core contribution is a closed-loop reward design paradigm in which regulatory logic is interpretable, liability attribution is traceable, and reward signals are generated automatically, bridging formal traffic regulations with data-driven policy optimization.

📝 Abstract

Reinforcement learning (RL) in autonomous driving employs a trial-and-error mechanism, enhancing robustness in unpredictable environments. However, crafting effective reward functions remains challenging, as conventional approaches rely heavily on manual design and demonstrate limited efficacy in complex scenarios. To address this issue, this study introduces a responsibility-oriented reward function that explicitly incorporates traffic regulations into the RL framework. Specifically, we introduce a Traffic Regulation Knowledge Graph and leverage Vision-Language Models alongside Retrieval-Augmented Generation techniques to automate reward assignment. This integration guides agents to adhere strictly to traffic laws, thus minimizing rule violations and optimizing decision-making performance in diverse driving conditions. Experimental validation demonstrates that the proposed methodology significantly improves the accuracy of assigning accident responsibility and effectively reduces the agent's liability in traffic incidents.
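
The abstract does not give the reward formula, so the following is only a minimal sketch of what a responsibility-oriented reward could look like, assuming the liability attribution step yields a share in [0, 1] for the agent. The names `StepOutcome`, `responsibility_reward`, and `collision_penalty`, and the linear scaling itself, are illustrative assumptions and not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    progress_reward: float     # ordinary task reward for this step (hypothetical)
    collided: bool             # whether an accident occurred on this step
    ego_responsibility: float  # agent's attributed share of liability, in [0, 1]


def responsibility_reward(outcome: StepOutcome, collision_penalty: float = 10.0) -> float:
    """Scale the collision penalty by the agent's attributed share of liability.

    Assumption: a linear penalty. The paper's actual formulation is not
    stated in this summary.
    """
    if not 0.0 <= outcome.ego_responsibility <= 1.0:
        raise ValueError("ego_responsibility must lie in [0, 1]")
    if not outcome.collided:
        return outcome.progress_reward
    return outcome.progress_reward - collision_penalty * outcome.ego_responsibility
```

Under this sketch, an accident for which the agent bears no responsibility leaves the step reward unchanged, while a fully attributable one incurs the whole penalty.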

Problem

Research questions and friction points this paper is trying to address.

Designing effective reward functions for autonomous driving RL

Automatically incorporating traffic regulations into the RL framework

Reducing rule violations and optimizing driving decisions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Responsibility-oriented reward function design

Traffic Regulation Knowledge Graph integration

Vision-Language Models for reward automation
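
To make the knowledge-graph retrieval idea concrete, here is a toy sketch of the retrieval half of a RAG pipeline over regulation triples. The triples are hypothetical placeholders, not real regulations or the paper's graph, and the keyword-overlap scoring stands in for whatever retriever the authors actually use; the VLM step that would consume the retrieved rules is omitted.

```python
# Hypothetical (subject, relation, object) triples; not taken from the paper.
TRIPLES = [
    ("lane_change", "requires", "yield_to_vehicles_in_target_lane"),
    ("red_light", "prohibits", "entering_intersection"),
    ("rear_vehicle", "must_keep", "safe_following_distance"),
]


def retrieve(query_terms, triples, top_k=2):
    """Return up to top_k triples sharing at least one token with the query."""

    def score(triple):
        # Split each underscore-joined field into tokens and count the overlap.
        tokens = set()
        for part in triple:
            tokens.update(part.split("_"))
        return len(tokens & set(query_terms))

    ranked = sorted(triples, key=score, reverse=True)
    return [t for t in ranked[:top_k] if score(t) > 0]
```

In a fuller pipeline, the query terms would presumably come from a scene description and the retrieved triples would be placed in the prompt used for liability attribution; both of those steps are assumptions here.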

🔎 Similar Papers

A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving