Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving

📅 2025-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address objective bias, poor policy flexibility, and slow convergence in reinforcement learning for autonomous driving, which stem from single-type action representations and scalarized single rewards, this paper proposes a hybrid parameterized action space coupled with a multi-objective critic architecture. The approach combines abstract decision-making with fine-grained control to keep policy execution compatible across multiple objectives, while uncertainty-driven exploration speeds up convergence during policy iteration. The method integrates parameterized action representations, multi-head critic networks, and Pareto-aware reward aggregation, and is evaluated in a simulated traffic environment and on the HighD real-world trajectory dataset. Compared to baseline methods, it achieves a 42% improvement in training efficiency and a 31% increase in Pareto-front coverage across driving objectives, improving driving efficiency, action consistency, and safety.

📝 Abstract

Reinforcement Learning (RL) has shown excellent performance in solving decision-making and control problems in autonomous driving, and it is increasingly applied in diverse driving scenarios. However, driving is a multi-attribute problem, which makes it challenging for current RL methods to achieve multi-objective compatibility in both policy execution and policy iteration. On the one hand, the common action space structure with a single action type limits driving flexibility or results in large behavior fluctuations during policy execution. On the other hand, a single reward function formed by weighting multiple attributes results in the agent paying disproportionate attention to certain objectives during policy iteration. To this end, we propose a Multi-objective Ensemble-Critic reinforcement learning method with Hybrid Parametrized Action for multi-objective compatible autonomous driving. Specifically, a parameterized action space is constructed to generate hybrid driving actions that combine abstract guidance with concrete control commands. A multi-objective critic architecture is constructed that considers multiple attribute rewards, so that the agent attends to different driving objectives simultaneously. Additionally, an uncertainty-based exploration strategy is introduced to help the agent approach a viable driving policy faster. The experimental results in both the simulated traffic environment and the HighD dataset demonstrate that our method can achieve multi-objective compatible autonomous driving in terms of driving efficiency, action consistency, and safety. It enhances overall driving performance while significantly increasing training efficiency.
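The three ingredients named in the abstract (a hybrid parameterized action, one critic ensemble per driving objective, and uncertainty-based exploration) can be illustrated with a toy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the option names in LANE_OPTIONS, the HybridAction type, the random linear "critics", the equal-weight sum over objectives, and the use of an optimism bonus (ensemble mean plus beta times ensemble standard deviation) as the exploration signal.

```python
import random
import statistics
from dataclasses import dataclass

# Hypothetical option and objective names; the paper's actual sets are not given here.
LANE_OPTIONS = ("keep_lane", "change_left", "change_right")
OBJECTIVES = ("efficiency", "consistency", "safety")


@dataclass(frozen=True)
class HybridAction:
    option: str          # discrete abstract guidance
    acceleration: float  # continuous control parameter attached to that option


def make_linear_critic(rng, n_features):
    """Toy critic: a random linear function standing in for a trained network."""
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    return lambda features: sum(w * x for w, x in zip(weights, features))


def featurize(state, action):
    """Concatenate state, one-hot option, and the continuous parameter."""
    one_hot = [1.0 if action.option == o else 0.0 for o in LANE_OPTIONS]
    return list(state) + one_hot + [action.acceleration]


def score(action, state, critics, beta):
    """Sum over objectives of (ensemble mean + beta * ensemble std).

    The std term is an assumed optimism bonus: actions on which the
    ensemble members disagree are favored, which encourages exploration.
    """
    feats = featurize(state, action)
    total = 0.0
    for objective in OBJECTIVES:
        values = [critic(feats) for critic in critics[objective]]
        total += statistics.fmean(values) + beta * statistics.pstdev(values)
    return total


def select_action(state, propose_acceleration, critics, beta=0.5):
    """Build one candidate per discrete option, then pick the best-scoring one."""
    candidates = [
        HybridAction(option, float(propose_acceleration(state, option)))
        for option in LANE_OPTIONS
    ]
    return max(candidates, key=lambda a: score(a, state, critics, beta))


# Usage with made-up numbers: a 2-feature state and 5 critics per objective.
rng = random.Random(0)
state = (0.6, 0.2)
n_features = len(state) + len(LANE_OPTIONS) + 1
critics = {
    obj: [make_linear_critic(rng, n_features) for _ in range(5)]
    for obj in OBJECTIVES
}
chosen = select_action(state, lambda s, o: 0.0 if o == "keep_lane" else -0.5, critics)
```

Keeping a separate ensemble per objective means each attribute reward gets its own value estimate before any aggregation, rather than being folded into one weighted scalar reward up front. How the paper actually aggregates the per-objective estimates is not stated in the abstract, so the plain sum here is only a placeholder.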

Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning

Autonomous Driving

Multi-objective Reward System

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective Ensemble Reinforcement Learning

Parameterized Actions

Autonomous Driving Performance

🔎 Similar Papers

A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving

2024-04-12 | 2024 IEEE Intelligent Vehicles Symposium (IV) | Citations: 8
