🤖 AI Summary
Under a fixed inference budget, multi-agent systems often saturate or even collapse because of limited context, lossy communication, and shared failures among similar agents. This work proposes a calibratable theoretical framework that combines a compute scaling exponent β, a message-fidelity function γ(m), a shared-error correlation ρ, and a tree-structured hierarchy. By coupling majority-vote aggregation with a continuous performance model, it identifies, for the first time, a sharp performance phase transition in such systems. The framework derives closed-form conditions for emergent synergy together with a critical budget threshold, defines an organization exponent s and a compute allocation rule, and precisely delineates the boundary of efficient collaboration. It also explains key bottlenecks observed in recent matched-budget scaling experiments with large-model agent systems.
📝 Abstract
Multi-agent systems can improve reliability, yet under a fixed inference budget they may help, saturate, or even collapse. We develop a minimal and calibratable theory that predicts these regimes from three binding constraints of modern agent stacks: finite context windows, lossy inter-agent communication, and shared failures among similar agents. Each leaf agent is summarized by a compute-performance scaling exponent $\beta$; communication is captured by a message-length fidelity curve $\gamma(m)$; dependence is captured by an effective shared-error correlation $\rho$; and a context window $W$ imposes hard fan-in limits that make hierarchy necessary. For binary success/failure tasks with majority aggregation, we prove a sharp phase transition for deep $b$-ary trees with correlated inputs and lossy communication: a single scalar $\alpha_\rho$ (combining $\gamma(m)$, $\rho$, and fan-in $b$) determines whether weak signal is amplified to a nontrivial fixed point or washed out to chance. In the amplifying regime, we derive an organization exponent $s$ and show that budgeted synergy, i.e., outperforming the best single agent under the same total budget, occurs exactly when $s>\beta$, yielding closed-form compute allocation rules and explicit budget thresholds. We further characterize saturation via a mixing depth and provide a conservative clipped predictor that remains accurate across growth and saturation. A continuous-performance warm-up gives closed-form risks for star, chain, and tree organizations, making correlation- and communication-induced floors explicit and exposing the core design trade-offs in a smooth setting. Finally, we validate the predicted phase boundaries in controlled synthetic simulations and show how the same mechanisms explain the dominant bottlenecks reported in recent large-scale matched-budget studies of LLM agent-system scaling.
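The amplify-or-wash-out dichotomy can be illustrated with a minimal sketch that is not the paper's exact model: a majority-of-3 recursion where each message passes through a symmetric bit-flip channel with probability `eps` (a stand-in for communication loss $1-\gamma(m)$), with errors assumed independent ($\rho = 0$) and fan-in $b = 3$. The slope of the recursion at $p = 1/2$ is $(3/2)(1 - 2\,\text{eps})$, so weak signal is amplified only when `eps < 1/6`, a threshold in the channel noise analogous to the condition on $\alpha_\rho$.

```python
def noisy_majority_step(p, eps):
    """One level of a 3-ary tree: children correct w.p. p, each answer
    flipped in transit w.p. eps, parent takes the majority vote."""
    q = p * (1 - eps) + (1 - p) * eps
    # Probability that at least 2 of 3 independent noisy copies are correct.
    return q**3 + 3 * q**2 * (1 - q)

def iterate(p0, eps, depth):
    """Iterate the recursion over `depth` levels of the tree."""
    p = p0
    for _ in range(depth):
        p = noisy_majority_step(p, eps)
    return p

if __name__ == "__main__":
    # Below the eps < 1/6 threshold: weak signal grows to a nontrivial fixed point.
    print(iterate(0.55, eps=0.05, depth=30))
    # Above the threshold: the signal is washed out toward chance (0.5).
    print(iterate(0.55, eps=0.25, depth=30))
```

Adding shared-error correlation or larger fan-in changes the slope at $1/2$ but not the qualitative picture: a single scalar decides which regime the deep tree lands in.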