🤖 AI Summary
To address the inefficiencies in autonomous lane-changing decision-making—stemming from insufficient agent collaboration, rigid rule-based constraints, and scarce high-quality training samples—this paper proposes MQLC, a hybrid Q-learning framework. MQLC integrates individual and global Q-value estimations into a collaborative hybrid Q-network and innovatively incorporates an LSTM/CNN-based intention recognition module into the multi-agent observation space, thereby enhancing observational fidelity, decision interpretability, and robustness. It follows the centralized training with decentralized execution (CTDE) paradigm. Evaluated on the SUMO+RL simulation platform, MQLC outperforms state-of-the-art methods—including MAPPO and QMIX—achieving a 23.7% improvement in average traffic throughput, a 19.2% increase in lane-change success rate, and a 31.5% reduction in conflict incidents.
📝 Abstract
Lane-changing decisions, which are crucial for autonomous vehicle path planning, face practical challenges due to rule-based constraints and limited data. Deep reinforcement learning has become a major research focus due to its advantages in data acquisition and interpretability. However, current models often overlook collaboration, which affects not only impacts overall traffic efficiency but also hinders the vehicle's own normal driving in the long run. To address the aforementioned issue, this paper proposes a method named Mix Q-learning for Lane Changing (MQLC) that integrates a hybrid value Q network, taking into account both collective and individual benefits for the greater good. At the collective level, our method coordinates the individual Q and global Q networks by utilizing global information. This enables agents to effectively balance their individual interests with the collective benefit. At the individual level, we integrate a deep learning-based intent recognition module into our observation and enhance the decision network. These changes provide agents with richer decision information and more precise feature extraction for improved lane-changing decisions. This strategy enables the multi-agent system to learn and formulate optimal decision-making strategies effectively. Our MQLC model, through extensive experimental results, impressively outperforms other state-of-the-art multi-agent decision-making methods, effectively improving the overall efficiency.