Agents that Matter: Optimizing Multi-Agent LLMs via Removal-Based Attribution

📅 2026-05-26
🤖 AI Summary
This work addresses the lack of a rigorous, unified framework for attributing credit to individual agents in multi-agent systems (MAS), a gap that hinders system optimization. The authors formalize agent attribution as a cooperative game parameterized by the coalition distribution, removal protocol, and target metric, and show that different removal protocols induce distinct games: agent ablation isolates structural bottlenecks, whereas introspective LLM judges fail to approximate this behavior faithfully. Leave-One-Out (LOO) attribution identifies bottleneck agents as effectively as combinatorial methods at a fraction of the computational cost. To evaluate the utility of specific agent backbones, the authors introduce attribution via model replacement; substituting the underlying models of low-contribution agents improves task performance by up to 17% and reduces cost by up to 35% across three benchmarks. An audit of a medical MAS shows that agent contributions to diagnostic accuracy and ethical behavior are often decoupled, and intervening on counterproductive roles increases ethics alignment while maintaining diagnostic accuracy.
📝 Abstract
As multi-agent systems (MAS) become increasingly complex, identifying the contributions of individual agents is critical for system optimization. However, existing approaches lack a rigorous, unified framework for credit assignment. In this work, we formalize agent attribution as a cooperative game, parameterized by the coalition distribution, removal protocol, and target metric. Using this framework, we show that Leave-One-Out (LOO) identifies bottleneck agents as effectively as combinatorial methods, but at a fraction of the computational cost. We also demonstrate that removal protocols induce distinct games: Agent ablation isolates structural bottlenecks, whereas introspective LLM judges fail to faithfully approximate this behavior. Furthermore, to evaluate the utility of specific agent backbones, we introduce attribution via model replacement. By substituting underlying models of low-contribution agents, we improve task performance by up to 17% while reducing cost by up to 35% across three benchmarks. Finally, we apply our framework to audit a medical MAS, revealing that agent contributions to diagnostic accuracy and ethical behavior are often decoupled. By intervening on counterproductive roles, we observe an increase in ethics alignment while maintaining diagnostic accuracy. Overall, this work provides a principled approach for cost-effective MAS attribution and intervention.
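The contrast the abstract draws between Leave-One-Out and combinatorial attribution can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the agent names, the toy value function `toy_value`, and the choice of "drop the agent from the coalition" as the removal protocol. Exact Shapley values stand in for "combinatorial methods"; the paper may use a different estimator.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence

# A value function maps a coalition of agents to a target-metric score.
ValueFn = Callable[[FrozenSet[str]], float]


def loo_attribution(agents: Sequence[str], value: ValueFn) -> Dict[str, float]:
    """Leave-One-Out: score drop when one agent is removed from the full team.

    Needs n + 1 evaluations of the value function.
    """
    full = frozenset(agents)
    v_full = value(full)
    return {a: v_full - value(full - {a}) for a in agents}


def shapley_attribution(agents: Sequence[str], value: ValueFn) -> Dict[str, float]:
    """Exact Shapley values: weighted marginal contribution over all coalitions.

    Evaluates the value function on every subset, so cost grows as 2^n.
    """
    n = len(agents)
    phi = {a: 0.0 for a in agents}
    for a in agents:
        others = [b for b in agents if b != a]
        for k in range(n):
            # Probability weight of a coalition of size k preceding agent a.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[a] += weight * (value(s | {a}) - value(s))
    return phi


def toy_value(coalition: FrozenSet[str]) -> float:
    """Hypothetical task accuracy for a coalition (illustrative numbers only)."""
    score = 0.0
    if "planner" in coalition:
        score += 0.1  # small independent contribution
    if "solver" in coalition:
        score += 0.6  # the bottleneck agent
        if "verifier" in coalition:
            score += 0.2  # verifier only helps when there is a solver
    return score


agents = ["planner", "solver", "verifier"]
loo = loo_attribution(agents, toy_value)
shap = shapley_attribution(agents, toy_value)

for name in agents:
    print(f"{name}: LOO={loo[name]:.2f}  Shapley={shap[name]:.2f}")
```

In this toy game both methods rank the hypothetical `solver` agent first, although the numeric scores differ because LOO only measures removal from the full team while Shapley averages over all coalitions. Whether that agreement holds on real systems is the empirical claim the paper evaluates; the sketch only shows the mechanics and the difference in evaluation cost.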
Problem

Research questions and friction points this paper is trying to address.

multi-agent systems
agent attribution
credit assignment
contribution evaluation
system optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

removal-based attribution
multi-agent LLMs
Leave-One-Out
model replacement
agent contribution