🤖 AI Summary
This study addresses the limited generalization of existing adaptive multi-agent systems on cross-domain tasks, where such systems often exhibit a “superficially effective but intrinsically flawed” coordination illusion. Through large-scale empirical analysis, it identifies two core issues for the first time: topological overfitting and illusory coordination. Employing cross-domain task evaluation, interaction behavior analysis, and system performance comparison, the work systematically assesses generalization capabilities and collaborative mechanisms. Findings reveal that current systems struggle to transfer effectively to novel domains and display internal interactions that significantly deviate from ideal multi-agent coordination patterns. These results challenge the reliability of deploying such systems in real-world settings and advocate for a paradigm shift in evaluation, from mere outcome correctness toward intrinsic generalization capacity.
📝 Abstract
Adaptive multi-agent systems (MAS) are increasingly adopted to tackle complex problems. However, the narrow task coverage of their optimization raises the question of whether they can function as general-purpose systems. To address this gap, we conduct an extensive empirical study of adaptive MAS, revealing two key findings: (1) topological overfitting: they fail to generalize across different domains; and (2) illusory coordination: they achieve reasonable surface-level accuracy while the underlying agent interactions diverge from ideal MAS behavior, raising concerns about their practical utility. These findings highlight the pressing need to prioritize generalization in MAS development and motivate evaluation protocols that extend beyond simple final-answer correctness.