🤖 AI Summary
Existing multi-agent systems typically optimize only a single dimension (communication topology, role assignment, or model routing) and treat agents as indivisible units, overlooking the potential of integrating multiple large language models (LLMs) within individual agents. This work proposes HieraMAS, a hierarchical collaboration framework that combines intra-node LLM mixtures with an inter-node communication topology. In its supernode architecture, each functional role is implemented by multiple heterogeneous LLMs arranged in a propose-synthesis structure. Because final task performance depends heavily on the capabilities of the underlying LLMs, reinforcement methods can reward suboptimal configurations; HieraMAS addresses this credit-assignment problem with multi-level reward attribution at the node and system levels, followed by topology selection framed as graph classification. Experiments on reasoning and coding benchmarks show that HieraMAS substantially outperforms existing approaches while achieving a better cost-performance trade-off.
📝 Abstract
Multi-agent systems (MAS) built on large language models (LLMs) have shown strong performance across many tasks. Most existing approaches improve only one aspect at a time, such as the communication topology, role assignment, or LLM routing, while treating each agent as a single, indivisible unit. This misses the opportunity to use mixtures of LLMs within an agent to strengthen role-specific abilities. We propose HieraMAS, a hierarchical collaboration framework that combines intra-node LLM mixtures with an inter-node communication topology. HieraMAS introduces supernodes, where each functional role is implemented by multiple heterogeneous LLMs using a propose-synthesis structure. Optimizing HieraMAS creates unique credit-assignment challenges: final task performance depends heavily on the underlying LLMs' capabilities, which can lead reinforcement methods to incorrectly reward suboptimal configurations. To address this, we use a two-stage algorithm: (1) multi-level reward attribution, which provides fine-grained feedback at both the node level and the overall system level; (2) graph classification for topology selection, which treats choosing the communication structure as a holistic decision rather than optimizing edges one by one. Experiments on reasoning and coding benchmarks show that HieraMAS substantially outperforms existing methods while also delivering better cost-performance trade-offs.
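To make the supernode idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes an LLM can be modeled as a plain function from prompt to text. The names `Supernode`, `run_topology`, and `select_topology` are hypothetical, and the scoring function stands in for whatever graph classifier the paper trains. The sketch shows only the structure: several LLMs propose, one synthesizes, supernodes pass messages along edges, and the topology is chosen as one whole-graph decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: an LLM is modeled as a function from prompt text to response text.
LLM = Callable[[str], str]
Edge = Tuple[str, str]  # (source role, destination role)


@dataclass
class Supernode:
    """One functional role backed by several LLMs (hypothetical structure)."""
    role: str
    proposers: Sequence[LLM]  # heterogeneous LLMs that each draft a proposal
    synthesizer: LLM          # LLM that merges the proposals into one output

    def run(self, task: str, inbox: Sequence[str] = ()) -> str:
        context = "\n".join(inbox)
        prompt = f"[{self.role}] task: {task}\ncontext: {context}"
        # Propose step: every proposer answers the same role-specific prompt.
        proposals = [llm(prompt) for llm in self.proposers]
        merged = "\n".join(f"proposal {i}: {p}" for i, p in enumerate(proposals))
        # Synthesis step: one LLM sees the prompt plus all proposals.
        return self.synthesizer(f"{prompt}\n{merged}")


def run_topology(nodes: Dict[str, Supernode], edges: List[Edge],
                 order: List[str], task: str) -> Dict[str, str]:
    """Run supernodes in the given order, forwarding outputs along edges."""
    outputs: Dict[str, str] = {}
    for name in order:
        inbox = [outputs[src] for src, dst in edges
                 if dst == name and src in outputs]
        outputs[name] = nodes[name].run(task, inbox)
    return outputs


def select_topology(candidates: Dict[str, List[Edge]],
                    score: Callable[[List[Edge]], float]) -> str:
    """Pick one whole candidate graph by score, not edge by edge.

    `score` is a placeholder for a learned graph classifier.
    """
    return max(candidates, key=lambda name: score(candidates[name]))
```

In this sketch the choice of topology is a single argmax over complete candidate graphs, which mirrors the abstract's framing of topology selection as a holistic decision. Reward attribution and training are deliberately left out, since the abstract does not specify them.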