MAGIC: Multi-Step Advantage-Gated Causal Influence for Multi-agent Reinforcement Learning

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a challenge in multi-agent reinforcement learning: agents lack effective coordination signals and struggle to quantify their long-term causal influence on one another. The authors propose an intrinsic reward mechanism that combines multi-step causal influence with advantage gating. Multi-step causal effects among agents are estimated through causal intervention and conditional mutual information. These effects are then gated by an advantage function, so that the resulting intrinsic rewards favor beneficial, goal-aligned cooperative behavior. On standard benchmarks, including MPE, SMAC, and SMACv2, the method outperforms existing approaches, with a reported improvement of at least 10.1% on the main evaluation metric.

📝 Abstract

A key challenge in multi-agent reinforcement learning (MARL) lies in designing learning signals that effectively promote coordination among agents. Designing such signals necessitates the ability to quantify the true, long-term causal influence between agents. To address this, we introduce Multi-step Advantage-Gated Interventional Causal MARL (MAGIC), a framework that extracts multi-step causal influences between agents and selectively converts them into intrinsic rewards. MAGIC uses causal intervention with conditional mutual information to quantify long-horizon agent influence, and introduces an advantage-based gating mechanism to ensure exploration is directed toward beneficial, goal-aligned behaviors. Experiments across multiple standard MARL benchmarks and task families, including MPE and SMAC/SMACv2, demonstrate that MAGIC outperforms state-of-the-art methods by a significant margin, achieving an improvement of at least 10.1% in the main evaluation metric.
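The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the estimator or the gating rule, so the plug-in estimator for discrete samples, the hard gate `advantage > 0`, the scale `beta`, and both function names below are assumptions chosen for illustration. Here `x` stands for agent i's action at step t (ideally sampled under intervention, independently of the state), `y` for agent j's discretized observation k steps later, and `z` for the conditioning state at step t.

```python
import numpy as np


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats from discrete samples.

    Hypothetical helper: a simple count-based estimator, assumed here
    only because the source does not specify one.
    """
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    cmi = 0.0
    for zv in np.unique(z):
        mask = z == zv
        p_z = mask.mean()                      # empirical P(Z = zv)
        xs, ys = x[mask], y[mask]
        for xv in np.unique(xs):
            p_x = (xs == xv).mean()            # empirical P(X = xv | zv)
            for yv in np.unique(ys):
                p_xy = ((xs == xv) & (ys == yv)).mean()
                if p_xy > 0:
                    p_y = (ys == yv).mean()    # empirical P(Y = yv | zv)
                    cmi += p_z * p_xy * np.log(p_xy / (p_x * p_y))
    return float(cmi)


def gated_intrinsic_reward(influence, advantage, beta=0.1):
    """Scale influence by beta and zero it where the advantage is not positive.

    The hard gate and the value of beta are assumptions, not taken
    from the paper.
    """
    influence = np.asarray(influence, dtype=float)
    gate = (np.asarray(advantage) > 0).astype(float)
    return beta * gate * influence


# Toy data: agent j's later observation copies agent i's action,
# so the influence estimate equals the entropy of the action, log(2).
actions_i = [0, 0, 1, 1]
later_obs_j = [0, 0, 1, 1]
state = [0, 0, 0, 0]
influence = conditional_mutual_information(actions_i, later_obs_j, state)

# The same influence is rewarded only on the step with positive advantage.
rewards = gated_intrinsic_reward([influence, influence], [1.0, -1.0])
```

In this toy case the estimate is zero when the later observation is independent of the action, and the gate suppresses the bonus on the step whose advantage is negative, which is the qualitative behavior the abstract describes.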

Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning

coordination

causal influence

learning signals

long-term influence

Innovation

Methods, ideas, or system contributions that make the work stand out.

causal influence

multi-agent reinforcement learning

intrinsic reward

advantage gating

conditional mutual information
