🤖 AI Summary
Existing AI orchestration methods are limited by coarse query-level routing, high training costs of reinforcement learning (RL), and the risk of routing collapse. This work proposes SkillOrchestra, a framework that explicitly models fine-grained skills and learns each agent’s proficiency and execution cost across these skills. By dynamically selecting the optimal agent based on the current interaction’s skill requirements and a performance–cost trade-off, SkillOrchestra achieves interpretable, scalable, and sample-efficient orchestration while avoiding the instability of end-to-end RL. Experimental results demonstrate that SkillOrchestra outperforms the best RL-based orchestrator by up to 22.5% across ten benchmarks, with training costs reduced by 700× and 300× compared to Router-R1 and ToolOrchestra, respectively.
📝 Abstract
Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models each agent's competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects the agents that best satisfy them under an explicit performance–cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while reducing learning cost by 700× and 300× compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
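The selection step described in the abstract (infer the skills an interaction demands, then pick the agent that best covers them under a performance–cost trade-off) can be illustrated with a minimal sketch. This is not the authors' implementation: the `AgentProfile` class, the `score` and `select_agent` functions, the linear scoring rule, and the `cost_weight` parameter are all hypothetical names and assumptions introduced here for illustration, and the competence and cost numbers in the usage note are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical per-agent profile (an assumption, not the paper's data model)."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # estimated cost per invocation, arbitrary units


def score(agent: AgentProfile, skill_weights: dict[str, float], cost_weight: float) -> float:
    """Weighted average competence on the demanded skills, minus a linear cost penalty.

    The linear trade-off is an assumed scoring rule chosen for simplicity.
    """
    total = sum(skill_weights.values())
    if total <= 0:
        raise ValueError("skill_weights must contain at least one positive weight")
    # Skills missing from an agent's profile count as zero competence.
    expected = sum(
        weight * agent.competence.get(skill, 0.0)
        for skill, weight in skill_weights.items()
    ) / total
    return expected - cost_weight * agent.cost


def select_agent(
    agents: list[AgentProfile],
    skill_weights: dict[str, float],
    cost_weight: float = 0.05,
) -> AgentProfile:
    """Return the agent with the highest trade-off score for the inferred skill demands."""
    if not agents:
        raise ValueError("agents must not be empty")
    return max(agents, key=lambda agent: score(agent, skill_weights, cost_weight))
```

With two made-up agents, `large` (competence 0.9 on `math`, 0.8 on `search`, cost 5.0) and `small` (0.6 on `math`, 0.75 on `search`, cost 1.0), calling `select_agent` with demand `{"math": 1.0}` and `cost_weight=0.01` picks `large`, while demand `{"search": 1.0}` with `cost_weight=0.05` picks `small`. Because the choice is re-evaluated for each interaction's skill demands, a cheaper agent wins whenever its competence on the demanded skill is close enough, which is the kind of behavior that counteracts routing collapse onto a single expensive option.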