🤖 AI Summary
This work addresses a counterintuitive phenomenon in heterogeneous large language model (LLM) multi-agent systems, where collaboration between strong and weak models often underperforms compared to weak–weak pairings due to cognitive mismatch. The study is the first to identify and characterize this issue, proposing an adaptive guidance framework grounded in multidimensional entropy—encompassing expressiveness, uncertainty, structure, coherence, and relevance—to dynamically assess the comprehension state of weaker agents and modulate guidance intensity accordingly. Additionally, an experience retrieval mechanism is introduced to enable both immediate adaptation and long-term collaborative memory. Evaluated on three benchmarks—GSM8K, MBPP, and CVRP—the approach significantly enhances the effectiveness and stability of heterogeneous multi-agent collaboration, demonstrating strong generalizability and scalability.
📝 Abstract
With recent breakthroughs in large language models (LLMs) for reasoning, planning, and complex task generation, artificial intelligence systems are transitioning from isolated single-agent architectures to multi-agent systems with collaborative intelligence. However, in heterogeneous multi-agent systems (HMAS), capability differences among agents give rise to persistent cognitive problems, in which strong and weak models fail to contribute effectively. We refer to such a collaboration as a strong-weak system. Through comprehensive experiments, we reveal a counterintuitive phenomenon in strong-weak systems: a strong-weak collaboration may underperform a weak-weak combination, indicating that cognitive mismatch is a key bottleneck limiting heterogeneous cooperation. To overcome this challenge, we propose an Entropy-Based Adaptive Guidance Framework that dynamically aligns guidance with the cognitive state of each agent. The framework quantifies the understanding of weak agents through multi-dimensional entropy metrics (covering expression, uncertainty, structure, coherence, and relevance) and adaptively adjusts the intensity of guidance across three levels: light, moderate, and intensive. Furthermore, a Retrieval-Augmented Generation (RAG) mechanism is incorporated to retain successful collaboration experiences, enabling both immediate adaptation and long-term learning. Extensive experiments on three benchmark datasets, GSM8K, MBPP, and CVRP, demonstrate that our approach consistently enhances the effectiveness and stability of heterogeneous collaboration. The results highlight that adaptive guidance not only mitigates cognitive imbalance but also establishes a scalable pathway toward more robust, cooperative multi-agent intelligence.
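The entropy-to-guidance mapping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the metric definitions, so the normalized token entropy, the equal weighting of the five dimensions, the thresholds `low` and `high`, and the assumption that a higher combined score means weaker understanding (and therefore stronger guidance) are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Optional


def token_entropy(text: str) -> float:
    """Normalized Shannon entropy of the whitespace-token distribution, in [0, 1].

    Hypothetical stand-in for one entropy dimension; the paper's actual
    metrics are not specified in the abstract.
    """
    tokens = text.lower().split()
    counts = Counter(tokens)
    if len(counts) < 2:
        # A single distinct token (or empty text) carries no entropy.
        return 0.0
    total = len(tokens)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Divide by the maximum entropy achievable with this many distinct tokens.
    return h / math.log2(len(counts))


def guidance_level(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
    low: float = 0.33,
    high: float = 0.66,
) -> str:
    """Map per-dimension scores in [0, 1] to one of three guidance levels.

    Assumption: a higher weighted-average score signals weaker understanding.
    The thresholds `low` and `high` are illustrative, not from the paper.
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    combined = sum(scores[name] * weights[name] for name in scores) / total_weight
    if combined < low:
        return "light"
    if combined < high:
        return "moderate"
    return "intensive"


# Example with made-up scores for the five dimensions named in the abstract.
example_scores = {
    "expression": token_entropy("add the two numbers then divide by the total"),
    "uncertainty": 0.7,
    "structure": 0.4,
    "coherence": 0.5,
    "relevance": 0.3,
}
print(guidance_level(example_scores))
```

In a sketch like this, the weak agent's response would be scored on every turn, and the strong agent would vary how much detail it supplies according to the returned level.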