🤖 AI Summary
To address human-dependent reward design, poor generalizability, and regulatory non-compliance in reinforcement learning for autonomous driving, this paper proposes a responsibility-oriented reward modeling paradigm. The method integrates a traffic regulation knowledge graph with vision-language models (VLMs) and uses retrieval-augmented generation (RAG) to attribute liability in accident scenarios at a fine-grained level, automatically generating dynamic, regulation-compliant reward signals. Embedded within a deep reinforcement learning framework, the approach improves liability attribution accuracy by 23.6%, reduces the agent's attributable responsibility in complex traffic incidents by 41.2%, and improves the compliance and safety of its decisions. The core contribution is a closed-loop reward design paradigm in which regulatory logic is interpretable, liability attribution is traceable, and reward signals are generated automatically, bridging formal traffic regulations and data-driven policy optimization.
📝 Abstract
Reinforcement learning (RL) in autonomous driving employs a trial-and-error mechanism, enhancing robustness in unpredictable environments. However, crafting effective reward functions remains challenging, as conventional approaches rely heavily on manual design and show limited efficacy in complex scenarios. To address this issue, this study introduces a responsibility-oriented reward function that explicitly incorporates traffic regulations into the RL framework. Specifically, we introduce a Traffic Regulation Knowledge Graph and leverage Vision-Language Models (VLMs) alongside Retrieval-Augmented Generation (RAG) to automate reward assignment. This integration guides agents to adhere to traffic laws, reducing rule violations and improving decision-making across diverse driving conditions. Experiments show that the proposed method significantly improves the accuracy of accident responsibility assignment and effectively reduces the agent's liability in traffic incidents.
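The abstract does not give the reward formula, so the following is only a minimal sketch of what a responsibility-oriented reward could look like, under stated assumptions. The regulation table, the keyword-overlap retrieval, the fault weights, and every function name here are hypothetical stand-ins: a real system would query the Traffic Regulation Knowledge Graph and let a VLM with RAG produce the scene description and the violation judgment, both of which are passed in by hand below.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Regulation:
    """One rule in a toy, flattened stand-in for a regulation knowledge graph."""
    rule_id: str
    keywords: frozenset[str]
    fault_weight: float  # assumed share of fault in [0, 1] if the ego vehicle broke this rule


# Hypothetical rules and weights, chosen only for illustration.
REGULATIONS = [
    Regulation("red_light", frozenset({"red", "light", "signal"}), 1.0),
    Regulation("following_distance", frozenset({"rear", "following", "distance"}), 0.7),
    Regulation("lane_change_yield", frozenset({"lane", "change", "yield"}), 0.5),
]


def retrieve(description: str, top_k: int = 2) -> list[Regulation]:
    """Keyword-overlap retrieval; a placeholder for the RAG step."""
    tokens = set(description.lower().split())
    scored = [(len(reg.keywords & tokens), reg) for reg in REGULATIONS]
    scored = [(score, reg) for score, reg in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [reg for _, reg in scored[:top_k]]


def ego_responsibility(description: str, ego_violations: set[str]) -> float:
    """Largest fault weight among retrieved rules that the ego vehicle violated.

    `ego_violations` stands in for the VLM's judgment of which rules were broken.
    """
    retrieved = retrieve(description)
    return max(
        (reg.fault_weight for reg in retrieved if reg.rule_id in ego_violations),
        default=0.0,
    )


def responsibility_reward(
    base_reward: float, collided: bool, responsibility: float, penalty: float = 10.0
) -> float:
    """Scale the collision penalty by the ego vehicle's share of responsibility."""
    if not collided:
        return base_reward
    return base_reward - penalty * responsibility


# Ego ran a red light and collided: full responsibility, full penalty.
at_fault = ego_responsibility("ego ran a red light and hit a crossing vehicle", {"red_light"})
reward_at_fault = responsibility_reward(1.0, collided=True, responsibility=at_fault)

# Ego was struck from behind without violating any rule: no penalty.
not_at_fault = ego_responsibility("ego was hit from the rear while stopped", set())
reward_not_at_fault = responsibility_reward(1.0, collided=True, responsibility=not_at_fault)
```

The design choice this illustrates is that a collision is not penalized uniformly: the penalty is weighted by how much of the incident is attributable to the agent, so an agent that is struck while driving lawfully is not discouraged from lawful behavior.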