🤖 AI Summary
To address the insufficient decision robustness in multi-vehicle cooperative planning for intelligent transportation systems—caused by coupled uncertainties in perception, planning, and communication—this paper proposes GRU-SAC, a deep reinforcement learning framework integrating Gated Recurrent Units (GRUs) with Soft Actor-Critic (SAC). GRU-SAC explicitly models temporal uncertainty via GRUs, enhancing latent-state inference under imperfect state observations and enabling learning of time-varying optimal cooperative actions. Experiments on the CARLA simulation platform demonstrate that, compared to baseline methods, GRU-SAC improves cooperative success rates by 19.3% in complex scenarios such as intersection traversal and emergency evasive maneuvers, reduces decision latency by 27.6%, and exhibits strong robustness against communication packet loss rates up to 30% and perception errors within ±1.5 m. These results significantly enhance safety and adaptability in multi-vehicle cooperative driving.
📝 Abstract
In future intelligent transportation systems, autonomous cooperative planning (ACP), becomes a promising technique to increase the effectiveness and security of multi-vehicle interactions. However, multiple uncertainties cannot be fully addressed for existing ACP strategies, e.g. perception, planning, and communication uncertainties. To address these, a novel deep reinforcement learning-based autonomous cooperative planning (DRLACP) framework is proposed to tackle various uncertainties on cooperative motion planning schemes. Specifically, the soft actor-critic (SAC) with the implementation of gate recurrent units (GRUs) is adopted to learn the deterministic optimal time-varying actions with imperfect state information occurred by planning, communication, and perception uncertainties. In addition, the real-time actions of autonomous vehicles (AVs) are demonstrated via the Car Learning to Act (CARLA) simulation platform. Evaluation results show that the proposed DRLACP learns and performs cooperative planning effectively, which outperforms other baseline methods under different scenarios with imperfect AV state information.