🤖 AI Summary
This work addresses the lack of theoretical grounding in existing sparse coordination graph learners for deciding which edges should exist and how much information each edge should carry, a gap that limits their ability to capture structural relationships among heterogeneous agents. Building on the Graph Information Bottleneck (GIB) framework, the authors introduce a group-aligned block-diagonal prior that strictly tightens the variational bound on topology learning, enabling the construction of a task-driven sparse coordination graph. Because the objective decomposes into group-wise blocks, the method can control edge density separately for each block, and it allocates communication bandwidth according to a water-filling principle, preserving only task-relevant information. Experimental results demonstrate that the proposed approach significantly enhances collaboration efficiency and scalability in multi-agent systems while maintaining strong theoretical interpretability.
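To make the group-wise decomposition concrete, the sketch below computes a KL regularizer between learned edge probabilities and a block-structured Bernoulli prior, then splits it by group block. It is an illustration under stated assumptions, not the authors' implementation: the Bernoulli edge model, the function names `bernoulli_kl` and `blockwise_kl`, and the prior values `p_in` and `p_out` are all hypothetical choices made for this example.

```python
import numpy as np

def bernoulli_kl(q, p, eps=1e-8):
    """Elementwise KL(Bern(q) || Bern(p)); inputs are clipped away from 0 and 1."""
    q = np.clip(q, eps, 1 - eps)
    p = np.clip(p, eps, 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def blockwise_kl(edge_probs, groups, p_in=0.5, p_out=0.05):
    """Split the edge-level KL term by (source group, target group) block.

    Assumption: the prior edge probability is p_in inside a group and
    p_out across groups, which yields a block-structured prior matrix.
    """
    edge_probs = np.asarray(edge_probs, dtype=float)
    groups = np.asarray(groups)
    n = len(groups)
    same_group = groups[:, None] == groups[None, :]
    prior = np.where(same_group, p_in, p_out)

    kl = bernoulli_kl(edge_probs, prior)
    kl[np.eye(n, dtype=bool)] = 0.0  # self-edges are not modeled

    blocks = {}
    labels = np.unique(groups)
    for g in labels:
        for h in labels:
            mask = (groups[:, None] == g) & (groups[None, :] == h)
            blocks[(int(g), int(h))] = float(kl[mask].sum())
    return float(kl.sum()), blocks

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.uniform(0.05, 0.95, size=(4, 4))
    total, blocks = blockwise_kl(q, groups=[0, 0, 1, 1])
    print(total, blocks)
```

Because the total is a sum over edges and every edge belongs to exactly one block, each block's term can carry its own weight or sparsity target. That is one way to read the per-block edge density control described above.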
📝 Abstract
Coordination graphs are a central abstraction in cooperative multi-agent reinforcement learning (MARL), yet existing sparse-graph learners lack a theoretically grounded mechanism to decide which edges should exist and how much information each edge should carry. Current methods rely on heuristic criteria that offer no formal guarantee on the learned topology and no principled way to allocate different communication capacities to structurally different agent relationships. To address this, we propose Heterogeneous Information-Bottleneck Coordination Graphs (HIBCG), which learns a group-aware sparse graph in which both edge existence and message capacity are theoretically justified. Using the graph information bottleneck (GIB) as the underlying tool, HIBCG first constructs a group-aligned block-diagonal prior that provides a closed-form criterion for edge retention (determining which edges should exist and at what density per group block), and then controls per-agent feature bandwidth on the resulting topology, compressing messages to retain only task-relevant content. We prove that the group-aligned prior strictly tightens the variational bound on topology learning, that the objective decomposes per group block, enabling differential edge control, and that capacity allocation follows a water-filling principle.
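For readers unfamiliar with the water-filling principle, the following is a minimal sketch of the classical version: a fixed budget is spread over channels so that allocation plus noise reaches a common "water level", and channels whose noise exceeds that level receive nothing. The abstract does not give HIBCG's exact allocation rule, so the mapping here is an assumption: `noise` is a hypothetical stand-in for how costly or task-irrelevant a channel is, and `budget` for the total capacity available.

```python
import numpy as np

def water_filling(noise, budget):
    """Classical water-filling: maximize sum(log(1 + p_i / n_i))
    subject to sum(p_i) == budget and p_i >= 0.

    Returns the per-channel allocation and the water level.
    """
    noise = np.asarray(noise, dtype=float)
    if budget < 0 or np.any(noise <= 0):
        raise ValueError("budget must be >= 0 and noise must be positive")

    sorted_noise = np.sort(noise)
    cumulative = np.cumsum(sorted_noise)

    # Try the k lowest-noise channels as the active set, largest k first.
    # The set is valid once the level rises above its noisiest member.
    level = sorted_noise[0]
    for k in range(len(noise), 0, -1):
        level = (budget + cumulative[k - 1]) / k
        if level > sorted_noise[k - 1]:
            break

    allocation = np.maximum(level - noise, 0.0)
    return allocation, level

if __name__ == "__main__":
    allocation, level = water_filling([1.0, 2.0, 5.0], budget=3.0)
    print(allocation, level)
```

In this example the water level settles at 3, so the two low-noise channels receive 2 and 1 units and the noisiest channel is shut off entirely. The same qualitative behavior, zero capacity for channels that contribute too little, is what lets a water-filling rule produce compressed messages.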