🤖 AI Summary
To address the limitations of fixed scheduling and inefficient coordination in multi-agent systems when adapting to dynamic task requirements, this paper proposes STRMAC, a state-aware routing framework. STRMAC encodes interaction history and agent knowledge separately, and its router uses both encodings to select, at each step, the single agent best suited to the next decision. The paper also introduces a self-evolving data generation mechanism that speeds up the collection of high-quality execution paths for training. On challenging collaborative reasoning benchmarks, STRMAC achieves up to a 23.8% performance improvement over baseline methods and reduces data collection overhead by up to 90.1% compared to exhaustive search. These results suggest that state-aware, step-wise routing can improve coordination efficiency in LLM-driven multi-agent systems whose task requirements change during execution.
📝 Abstract
The emergence of multi-agent systems powered by large language models (LLMs) has unlocked new frontiers in complex task-solving, enabling diverse agents to integrate unique expertise, collaborate flexibly, and address challenges unattainable for individual models. However, the full potential of such systems is hindered by rigid agent scheduling and inefficient coordination strategies that fail to adapt to evolving task requirements. In this paper, we propose STRMAC, a state-aware routing framework designed for efficient collaboration in multi-agent systems. Our method separately encodes interaction history and agent knowledge to power the router, which adaptively selects the most suitable single agent at each step. Furthermore, we introduce a self-evolving data generation approach that accelerates the collection of high-quality execution paths for system training. Experiments on challenging collaborative reasoning benchmarks demonstrate that our method achieves state-of-the-art performance, with up to 23.8% improvement over baselines and up to 90.1% lower data collection overhead than exhaustive search.
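The routing idea described in the abstract (encode the interaction history and each agent's knowledge separately, then pick one agent per step) can be illustrated with a minimal sketch. This is not the STRMAC implementation: the bag-of-words encoder, the cosine scoring, the agent names, and the profile strings are all assumptions made for illustration, standing in for whatever learned encoders and router the paper actually trains.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words encoder (assumed stand-in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(history, agent_profiles):
    """Select a single agent for the next step.

    The interaction history and the agent profiles are encoded separately;
    the router compares the two encodings and returns the best match.
    """
    state = embed(" ".join(history))
    scores = {
        name: cosine(state, embed(profile))
        for name, profile in agent_profiles.items()
    }
    return max(scores, key=scores.get)


# Hypothetical agents and profile descriptions, for illustration only.
agents = {
    "math_agent": "solve equation algebra arithmetic compute",
    "code_agent": "write python function debug program",
}

history = ["user: please compute the root of this equation"]
first_choice = route(history, agents)

# The state changes as the interaction grows, so the selection can change too.
history.append("math_agent: now write a python function to debug")
second_choice = route(history, agents)
```

Because the router is re-run on the updated history at every step, the selected agent tracks the current state of the task instead of following a fixed schedule, which is the behaviour the abstract contrasts with rigid agent scheduling.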