🤖 AI Summary
This study investigates whether communication mechanisms can foster emergent cooperation in sequential social dilemmas in quantum multi-agent reinforcement learning. It applies communication protocols originally developed for classical agents (MATE, MEDIATE, Gifting, and RIAL) to quantum Q-learning agents and evaluates them on three canonical social dilemmas: the Iterated Prisoner's Dilemma, Iterated Stag Hunt, and Iterated Game of Chicken. In the experiments, the variants MATE_TD, AutoMATE, MEDIATE-I, and MEDIATE-S reach high cooperation levels across all three dilemmas. These results indicate that communication is a viable mechanism for promoting cooperation in quantum multi-agent systems.
📝 Abstract
Emergent cooperation in classical Multi-Agent Reinforcement Learning has gained significant attention, particularly in the context of Sequential Social Dilemmas (SSDs). While classical reinforcement learning approaches have demonstrated capability for emergent cooperation, research on extending these methods to Quantum Multi-Agent Reinforcement Learning remains limited, especially with respect to communication. In this paper, we apply communication approaches to quantum Q-Learning agents: the Mutual Acknowledgment Token Exchange (MATE) protocol, its extension Mutually Endorsed Distributed Incentive Acknowledgment Token Exchange (MEDIATE), the peer rewarding mechanism Gifting, and Reinforced Inter-Agent Learning (RIAL). We evaluate these approaches in three SSDs: the Iterated Prisoner's Dilemma, Iterated Stag Hunt, and Iterated Game of Chicken. Our experimental results show that approaches using MATE with a temporal-difference measure (MATE_TD), AutoMATE, MEDIATE-I, and MEDIATE-S achieved high cooperation levels across all dilemmas, demonstrating that communication is a viable mechanism for fostering emergent cooperation in Quantum Multi-Agent Reinforcement Learning.
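To make the three dilemmas named in the abstract concrete, the following is a minimal Python sketch of them as symmetric two-player matrix games, together with a simple cooperation-rate measure. The payoff values are illustrative textbook-style numbers chosen only to satisfy each game's defining ordering; they are assumptions, not the rewards used in the paper. The names `make_game`, `GAMES`, and `play_iterated` are likewise hypothetical.

```python
# Illustrative sketch: payoff values are assumptions, not taken from the paper.
# Actions: "C" = cooperate, "D" = defect.
# R = reward (both cooperate), S = sucker's payoff (I cooperate, other defects),
# T = temptation (I defect, other cooperates), P = punishment (both defect).

def make_game(R, S, T, P):
    """Return a dict mapping (my_action, other_action) to my reward."""
    return {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

GAMES = {
    "prisoners_dilemma": make_game(R=3, S=0, T=5, P=1),  # ordering T > R > P > S
    "stag_hunt": make_game(R=4, S=0, T=3, P=1),          # ordering R > T > P > S
    "chicken": make_game(R=3, S=1, T=5, P=0),            # ordering T > R > S > P
}

def play_iterated(game, actions_a, actions_b):
    """Score two equal-length action sequences played against each other.

    Returns (total reward of agent A, total reward of agent B,
    fraction of all chosen actions that were cooperative).
    """
    if len(actions_a) != len(actions_b):
        raise ValueError("both agents must act in every round")
    rounds = list(zip(actions_a, actions_b))
    total_a = sum(game[(a, b)] for a, b in rounds)
    total_b = sum(game[(b, a)] for a, b in rounds)
    cooperative = sum((a == "C") + (b == "C") for a, b in rounds)
    cooperation_rate = cooperative / (2 * len(rounds)) if rounds else 0.0
    return total_a, total_b, cooperation_rate
```

For example, `play_iterated(GAMES["prisoners_dilemma"], "CCD", "CDD")` scores three rounds in which agent B defects one round earlier than agent A. A learned policy, quantum or classical, would take the place of the fixed action strings.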