🤖 AI Summary
To address the challenges of inaccurate credit assignment and inefficient exploration in multi-agent reinforcement learning (MARL) under sparse rewards, this paper proposes an agent-specific influence scope (ISA) framework for state-dimension-level attribution. ISA decouples state attributes, dynamically quantifies each agent’s influence on individual state dimensions, and jointly models action–state-attribute dependencies to enable fine-grained, time-varying credit assignment and personalized exploration space pruning. Unlike conventional global- or action-level attribution methods, ISA introduces, for the first time, an interpretable, dimension-wise influence quantification mechanism grounded in state representation. Evaluated on multiple sparse-reward MARL benchmarks, ISA consistently outperforms state-of-the-art approaches, achieving an average 37% acceleration in convergence speed and a 12.6% improvement in final performance.
📝 Abstract
Training cooperative agents in sparse-reward scenarios poses significant challenges for multi-agent reinforcement learning (MARL). Without clear feedback on actions at each step in sparse-reward setting, previous methods struggle with precise credit assignment among agents and effective exploration. In this paper, we introduce a novel method to deal with both credit assignment and exploration problems in reward-sparse domains. Accordingly, we propose an algorithm that calculates the Influence Scope of Agents (ISA) on states by taking specific value of the dimensions/attributes of states that can be influenced by individual agents. The mutual dependence between agents' actions and state attributes are then used to calculate the credit assignment and to delimit the exploration space for each individual agent. We then evaluate ISA in a variety of sparse-reward multi-agent scenarios. The results show that our method significantly outperforms the state-of-art baselines.