🤖 AI Summary
To address action lag and stale information caused by communication delays in multi-agent reinforcement learning (MARL), this paper proposes an information-value-aware low-latency communication mechanism. The method introduces: (1) a computable Value of Information (VoI) metric that jointly quantifies message timeliness and decision impact; (2) a progressive message reception protocol that dynamically prioritizes high-VoI messages and adaptively adjusts waiting duration; and (3) theoretical guarantees on convergence and bounded end-to-end delay. Experiments in time-critical autonomous driving scenarios demonstrate that the approach reduces latency for high-VoI messages by 37.2% on average, eliminates redundant waiting, and improves both coordination efficiency and MARL policy performance. The method consistently outperforms state-of-the-art baselines across diverse channel conditions.
📝 Abstract
Inter-agent communication is an effective mechanism for improving performance in collaborative multi-agent reinforcement learning (MARL) systems. However, the communication latency inherent in practical systems both delays action decisions and causes agents to share outdated information, which limits MARL performance gains, particularly in time-critical applications such as autonomous driving. In this work, we propose a Value-of-Information-aware Low-latency Communication (VIL2C) scheme that proactively adjusts the latency distribution to mitigate these effects in MARL systems. Specifically, we define a Value of Information (VoI) metric that quantifies the importance of transmitting each delayed message. Moreover, we propose a progressive message reception mechanism that adaptively adjusts the reception duration based on the messages already received. We derive the optimized VoI-aware resource allocation and theoretically prove the performance advantage of the proposed VIL2C scheme. Extensive experiments demonstrate that VIL2C outperforms existing approaches under various communication conditions. These gains are attributed to the low-latency transmission of high-VoI messages via resource allocation and the elimination of unnecessary waiting periods via adaptive reception duration.
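To make the idea of an adaptive reception duration concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes each message carries a precomputed VoI score and an arrival latency, and it uses a simple stopping rule chosen purely for illustration: the receiver stops waiting once the accumulated VoI reaches a target, or once a hard deadline passes. The names `Message`, `progressive_receive`, `max_wait`, and `voi_target` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: int      # id of the sending agent
    voi: float       # assumed precomputed VoI score (higher = more important)
    latency: float   # arrival time in ms, measured from the start of the step


def progressive_receive(messages, max_wait, voi_target):
    """Illustrative stopping rule (an assumption, not the paper's rule).

    Messages are accepted in arrival order. Reception ends early as soon as
    the accumulated VoI reaches voi_target; otherwise the receiver waits
    until the max_wait deadline.
    Returns (received_messages, time_spent_waiting).
    """
    received = []
    gathered = 0.0
    for msg in sorted(messages, key=lambda m: m.latency):
        if msg.latency > max_wait:
            break  # this and all later messages miss the deadline
        received.append(msg)
        gathered += msg.voi
        if gathered >= voi_target:
            return received, msg.latency  # enough information: stop early
    return received, max_wait
```

With this rule, a receiver whose two most valuable messages arrive early acts immediately instead of waiting for a late, low-VoI message, which is the kind of unnecessary waiting the abstract refers to.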