🤖 AI Summary
Multi-agent reinforcement learning (MARL) faces three key challenges in dynamic competitive settings such as pursuit-evasion: training instability, poor adversarial robustness, and limited generalization across varying agent scales. To address these, we propose LEGO, the first MARL framework integrating E(n)-equivariant graph neural networks with role-aware representation learning. LEGO achieves rigid-body transformation equivariance via local coordinate normalization, and jointly incorporates permutation equivariance, heterogeneous feature encoding, and relational modeling, while remaining compatible with mainstream algorithms such as MAPPO. Empirically, LEGO significantly improves policy stability and generalization under unseen swarm sizes, node failures, and adversarial perturbations. It consistently outperforms strong baselines across diverse cooperative and competitive tasks. By enabling scalable, robust, and equivariant multi-agent coordination, LEGO establishes a new paradigm for swarm robotics control via MARL.
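The local coordinate normalization idea can be illustrated with a minimal 2D sketch (not the paper's implementation; the function name and interface are hypothetical): each agent re-expresses all positions in its own body frame, so the resulting features are unchanged under any global translation and rotation of the world.

```python
import numpy as np

def canonicalize(positions, ego_idx, heading):
    """Map world-frame positions into one agent's local frame (2D sketch).

    positions : (N, 2) array of world-frame agent positions
    ego_idx   : index of the agent whose frame we canonicalize into
    heading   : that agent's world-frame heading angle (radians)
    """
    # Subtracting the ego position cancels any global translation.
    rel = positions - positions[ego_idx]
    # Rotating by -heading aligns the ego heading with the +x axis,
    # cancelling any global rotation.
    c, s = np.cos(-heading), np.sin(-heading)
    R = np.array([[c, -s],
                  [s,  c]])
    return rel @ R.T
```

Because a global rigid-body motion shifts every position and every heading by the same amount, the canonicalized features it produces are identical before and after the motion, which is what lets a policy consuming them behave equivariantly.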
📝 Abstract
Multi-agent reinforcement learning (MARL) has emerged as a powerful paradigm for coordinating swarms of agents in complex decision-making, yet major challenges remain. In competitive settings such as pursuit-evasion tasks, simultaneous adaptation can destabilize training; non-kinetic countermeasures often fail under adverse conditions; and policies trained in one configuration rarely generalize to environments with a different number of agents. To address these issues, we propose the Local-Canonicalization Equivariant Graph Neural Networks (LEGO) framework, which integrates seamlessly with popular MARL algorithms such as MAPPO. LEGO employs graph neural networks to capture permutation equivariance and generalize across agent numbers, canonicalization to enforce E(n)-equivariance, and heterogeneous representations to encode role-specific inductive biases. Experiments on cooperative and competitive swarm benchmarks show that LEGO outperforms strong baselines and improves generalization. In real-world experiments, LEGO demonstrates robustness to varying team sizes and agent failure.
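The permutation-equivariance property attributed to the graph backbone can be sketched in a few lines (an illustrative mean-aggregation message-passing layer, not the paper's architecture; all names are hypothetical): because each node aggregates its neighbors with an order-independent reduction, relabeling the agents simply relabels the outputs, which is also why the same weights apply to any team size.

```python
import numpy as np

def gnn_layer(x, adj, w_self, w_nbr):
    """One mean-aggregation message-passing layer (illustrative).

    x      : (N, d_in) node features, one row per agent
    adj    : (N, N) 0/1 adjacency matrix (no self-loops assumed)
    w_self : (d_in, d_out) weights applied to each node's own features
    w_nbr  : (d_in, d_out) weights applied to the aggregated neighbors
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    msg = (adj @ x) / deg                      # average over neighbors
    return np.tanh(x @ w_self + msg @ w_nbr)   # shared weights for all nodes
```

Permuting the agents (rows of `x` together with rows and columns of `adj`) permutes the output rows identically, and since the weight shapes never depend on N, the layer transfers unchanged to a different number of agents.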