Mix Q-Learning for Lane Changing: A Collaborative Decision-Making Method in Multi-Agent Deep Reinforcement Learning

📅 2024-06-14
🏛️ IEEE Transactions on Vehicular Technology
📈 Citations: 5
Influential: 0
🤖 AI Summary
To address the inefficiencies in autonomous lane-changing decision-making—stemming from insufficient agent collaboration, rigid rule-based constraints, and scarce high-quality training samples—this paper proposes MQLC, a hybrid Q-learning framework. MQLC integrates individual and global Q-value estimations into a collaborative hybrid Q-network and innovatively incorporates an LSTM/CNN-based intention recognition module into the multi-agent observation space, thereby enhancing observational fidelity, decision interpretability, and robustness. It follows the centralized training with decentralized execution (CTDE) paradigm. Evaluated on the SUMO+RL simulation platform, MQLC outperforms state-of-the-art methods—including MAPPO and QMIX—achieving a 23.7% improvement in average traffic throughput, a 19.2% increase in lane-change success rate, and a 31.5% reduction in conflict incidents.
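As a rough illustration of the collaborative idea the summary describes — individual Q-values propose actions, a global Q-value arbitrates among candidate joint actions — here is a minimal pure-Python sketch. The table-based `individual_q`/`global_q` lookups, the top-2 candidate cut, and the function names are illustrative assumptions, not the paper's actual network architecture:

```python
import random

def collaborative_action_selection(individual_q, global_q, agents, actions, epsilon=0.1):
    """Pick a joint action: each agent proposes its individually preferred
    actions, then a global Q-value arbitrates among the candidate joint
    actions (a simplified CTDE-style scheme, not the paper's exact method)."""
    # Each agent proposes its top candidates by individual Q-value.
    proposals = {}
    for ag in agents:
        ranked = sorted(actions, key=lambda a: individual_q[(ag, a)], reverse=True)
        proposals[ag] = ranked[:2]  # keep the two best candidates per agent

    # Enumerate candidate joint actions from the proposals.
    def joint_candidates(idx, partial):
        if idx == len(agents):
            yield tuple(partial)
            return
        for a in proposals[agents[idx]]:
            yield from joint_candidates(idx + 1, partial + [a])

    # Score each candidate with the global Q and keep the best.
    best_joint, best_score = None, float("-inf")
    for joint in joint_candidates(0, []):
        score = global_q.get(joint, 0.0)
        if score > best_score:
            best_joint, best_score = joint, score

    # Epsilon-greedy exploration at the joint-action level.
    if random.random() < epsilon:
        return tuple(random.choice(actions) for _ in agents)
    return best_joint
```

With `epsilon=0.0` the selection is deterministic: an agent whose individually best action would hurt the group can still be steered to a cooperative joint action by the global score.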

📝 Abstract
Lane-changing decisions, which are crucial for autonomous vehicle path planning, face practical challenges due to rule-based constraints and limited data. Deep reinforcement learning has become a major research focus due to its advantages in data acquisition and interpretability. However, current models often overlook collaboration, which not only impacts overall traffic efficiency but also hinders the vehicle's own normal driving in the long run. To address this issue, this paper proposes a method named Mix Q-learning for Lane Changing (MQLC) that integrates a hybrid value Q network, taking into account both collective and individual benefits. At the collective level, our method coordinates the individual Q and global Q networks by utilizing global information, enabling agents to effectively balance their individual interests with the collective benefit. At the individual level, we integrate a deep learning-based intent recognition module into the observation and enhance the decision network. These changes provide agents with richer decision information and more precise feature extraction for improved lane-changing decisions. This strategy enables the multi-agent system to learn and formulate optimal decision-making strategies effectively. Extensive experimental results show that our MQLC model outperforms other state-of-the-art multi-agent decision-making methods, effectively improving overall traffic efficiency.
Problem

Research questions and friction points this paper is trying to address.

Addresses lane-changing challenges in autonomous vehicles
Overcomes lack of collaboration in multi-agent reinforcement learning
Balances individual and collective benefits for traffic efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Q-network balances individual and collective benefits
Integrates intent recognition module for enhanced observation
Coordinates global and individual Q networks using global information
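The intent-recognition idea in the last two bullets can be sketched without the paper's LSTM/CNN: estimate a neighbor's lane-change intent from its recent lateral motion and append it to the ego observation. The drift-threshold heuristic, function names, and one-hot encoding below are illustrative assumptions standing in for the learned module:

```python
def estimate_intent(lateral_history, drift_threshold=0.3):
    """Classify a neighboring vehicle's lane-change intent from a short
    history of lateral offsets (metres from its lane centre).
    A simple drift heuristic stands in for the paper's learned recognizer."""
    if len(lateral_history) < 2:
        return "keep"
    drift = lateral_history[-1] - lateral_history[0]
    if drift > drift_threshold:
        return "left"
    if drift < -drift_threshold:
        return "right"
    return "keep"

def augment_observation(ego_obs, neighbor_histories):
    """Append a one-hot intent estimate per neighbor to the ego observation,
    enriching the decision network's input as the bullets describe."""
    onehot = {"keep": [1, 0, 0], "left": [0, 1, 0], "right": [0, 0, 1]}
    out = list(ego_obs)
    for hist in neighbor_histories:
        out.extend(onehot[estimate_intent(hist)])
    return out
```

The augmented vector then feeds the individual Q network, so each agent decides with an explicit estimate of what its neighbors are about to do rather than raw positions alone.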