MonoScale: Scaling Multi-Agent System with Monotonic Improvement

πŸ“… 2026-01-30
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the performance degradation commonly observed when naively scaling agent pools in large language model–based multi-agent systems, where routing modules suffer from cold-start issues with newly introduced heterogeneous and unreliable agents. To mitigate this, the authors propose an expansion-aware update framework that formulates sequential pool augmentation as a contextual bandit problem. The approach generates agent-conditioned familiarization tasks to collect evidence from successful and failed interactions, which is then distilled into auditable natural language memories to inform routing decisions. Coupled with a trust-region memory update mechanism, this framework provides the first formal guarantee of monotonic non-decreasing performance during pool expansion. Experiments on the GAIA and Humanity's Last Exam benchmarks demonstrate consistent performance gains as the agent pool scales, significantly outperforming both naive expansion strategies and strong fixed-pool routing baselines.

πŸ“ Abstract
In recent years, LLM-based multi-agent systems (MAS) have advanced rapidly, using a router to decompose tasks and delegate subtasks to specialized agents. A natural way to expand capability is to scale up the agent pool by continually integrating new functional agents or tool interfaces, but naive expansion can trigger performance collapse when the router cold-starts on newly added, heterogeneous, and unreliable agents. We propose MonoScale, an expansion-aware update framework that proactively generates a small set of agent-conditioned familiarization tasks, harvests evidence from both successful and failed interactions, and distills it into auditable natural-language memory to guide future routing. We formalize sequential augmentation as a contextual bandit and perform trust-region memory updates, yielding a monotonic non-decreasing performance guarantee across onboarding rounds. Experiments on GAIA and Humanity's Last Exam show stable gains as the agent pool grows, outperforming naive scale-up and strong-router fixed-pool baselines.
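The routing loop the abstract describes can be sketched as a contextual bandit over the agent pool: new agents are cold-started from familiarization-task outcomes rather than live traffic, and per-agent memory is only updated within a trust region so a single noisy round cannot swing routing. This is a minimal illustrative sketch under assumed names and a simplified clipped-step update rule, not the paper's actual algorithm:

```python
import random

TRUST_RADIUS = 0.2   # assumed: max change to an agent's score per round
EPSILON = 0.1        # assumed: exploration rate over the pool

class BanditRouter:
    """Toy epsilon-greedy router with trust-region memory updates."""

    def __init__(self):
        self.memory = {}  # agent name -> estimated success rate

    def add_agent(self, name, familiarization_results):
        """Onboard a new agent from probe-task outcomes (list of bools),
        avoiding a cold start on live traffic."""
        n = max(len(familiarization_results), 1)
        self.memory[name] = sum(familiarization_results) / n

    def route(self, rng=random.random):
        """Epsilon-greedy selection: explore occasionally, else pick the
        agent with the highest remembered success rate."""
        if rng() < EPSILON:
            return random.choice(list(self.memory))
        return max(self.memory, key=self.memory.get)

    def update(self, name, observed_rate):
        """Trust-region update: clip the step so one noisy round cannot
        collapse an agent's score (and hence routing performance)."""
        old = self.memory[name]
        step = max(-TRUST_RADIUS, min(TRUST_RADIUS, observed_rate - old))
        self.memory[name] = old + step
```

The clipping in `update` is the informal analogue of the paper's monotonicity guarantee: bounded memory steps bound how far routing quality can degrade in any one onboarding round.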
Problem

Research questions and friction points this paper is trying to address.

multi-agent system
agent scaling
router cold-start
performance collapse
heterogeneous agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

MonoScale
multi-agent system
monotonic improvement
contextual bandit
natural-language memory
πŸ”Ž Similar Papers
No similar papers found.