🤖 AI Summary
To address the fragility of collaboration in multi-agent reinforcement learning (MARL) under inconsistent or adversarial communication, this paper extends power regularization to the communication level, an approach it calls Communicative Power Regularization (CPR). The method explicitly models and constrains inter-agent communication strength via three components: (i) a gradient-aware metric quantifying communication influence, (ii) a soft-constrained regularization term on communication weights, and (iii) a joint optimization framework integrating this regularization into MARL policy gradients. Evaluated on three benchmark tasks (Red-Door-Blue-Door, Predator-Prey, and Grid Coverage), the approach improves robustness under adversarial communication, limiting the degradation in collaboration performance to under 3% and outperforming existing baselines. The core contribution is a mechanism for quantifying and regulating communication influence in MARL, which the paper presents as the first of its kind, improving system security and resilience without compromising collaborative efficacy.
📝 Abstract
Effective communication in Multi-Agent Reinforcement Learning (MARL) can significantly enhance coordination and collaborative performance in complex and partially observable environments. However, reliance on communication can also introduce vulnerabilities when agents are misaligned, potentially leading to adversarial interactions that exploit implicit assumptions of cooperative intent. Prior work has addressed adversarial behavior through power regularization, which controls the influence one agent exerts over another, but has largely overlooked the role of communication in these dynamics. This paper introduces Communicative Power Regularization (CPR), which extends power regularization to communication channels. By explicitly quantifying and constraining agents' communicative influence during training, CPR mitigates vulnerabilities arising from misaligned or adversarial communication. Evaluations across three benchmark environments (Red-Door-Blue-Door, Predator-Prey, and Grid Coverage) demonstrate that the approach significantly enhances robustness to adversarial communication while preserving cooperative performance, offering a practical framework for secure and resilient cooperative MARL systems.
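To make "quantifying and constraining communicative influence" concrete, the following is a minimal toy sketch and not the paper's formulation. It assumes, purely for illustration, that a sender's communicative power over a listener is the worst-case drop in the listener's expected reward when the sender replaces its nominal message with any other message, and that the penalty is subtracted from the task return with a weight `lam`. All function names, the tabular listener, and the numbers are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def listener_value(logits_per_message, true_rewards, message):
    """Expected reward of the listener after receiving `message`.

    logits_per_message[m] holds the listener's action logits given message m
    (a hypothetical tabular listener with a fixed observation).
    """
    policy = softmax(logits_per_message[message])
    return float(policy @ true_rewards)


def communicative_power(logits_per_message, true_rewards, nominal_message):
    """Assumed proxy for communicative power: nominal value minus worst-case value."""
    nominal = listener_value(logits_per_message, true_rewards, nominal_message)
    worst = min(
        listener_value(logits_per_message, true_rewards, m)
        for m in range(len(logits_per_message))
    )
    return nominal - worst


def regularized_objective(task_return, power, lam):
    """Task return penalized by communicative power (assumed additive form)."""
    return task_return - lam * power


# Two actions; only action 0 is rewarded.
rewards = np.array([1.0, 0.0])

# A listener that fully trusts the message: message m makes it pick action m.
trusting = np.array([[5.0, 0.0], [0.0, 5.0]])

# A listener whose policy ignores the message entirely.
independent = np.array([[5.0, 0.0], [5.0, 0.0]])

power_trusting = communicative_power(trusting, rewards, nominal_message=0)
power_independent = communicative_power(independent, rewards, nominal_message=0)
```

In this toy setting the trusting listener gives the sender high power, because a single misleading message flips its action, while the message-independent listener gives the sender none. A regularizer of this kind would push training toward policies between the two extremes: still using messages when they help, but not collapsing when a message is adversarial.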