SkillOrchestra: Learning to Route Agents via Skill Transfer

📅 2026-02-23
🤖 AI Summary
Existing AI orchestration methods are limited by coarse query-level routing, high training costs of reinforcement learning (RL), and the risk of routing collapse. This work proposes SkillOrchestra, a framework that explicitly models fine-grained skills and learns each agent’s proficiency and execution cost across these skills. By dynamically selecting the optimal agent based on the current interaction’s skill requirements and a performance–cost trade-off, SkillOrchestra achieves interpretable, scalable, and sample-efficient orchestration while avoiding the instability of end-to-end RL. Experimental results demonstrate that SkillOrchestra outperforms the best RL-based orchestrator by up to 22.5% across ten benchmarks, with training costs reduced by 700× and 300× compared to Router-R1 and ToolOrchestra, respectively.

📝 Abstract
Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while reducing learning cost by 700x and 300x compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
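The deployment step described in the abstract (infer skill demands, then pick the agent that best satisfies them under a performance-cost trade-off) can be illustrated with a minimal sketch. This is not the authors' implementation: the `AgentProfile` class, the `score` and `route` functions, the linear scoring rule, the `cost_weight` parameter, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentProfile:
    """Hypothetical per-agent profile: estimated competence per skill plus a call cost."""
    name: str
    competence: Dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # relative cost of one invocation


def score(agent: AgentProfile, demands: Dict[str, float], cost_weight: float) -> float:
    """Demand-weighted competence minus a cost penalty (an assumed scoring rule)."""
    total = sum(demands.values())
    if total <= 0:
        raise ValueError("skill demands must have positive total weight")
    expected = sum(
        weight * agent.competence.get(skill, 0.0)  # unknown skills count as 0 competence
        for skill, weight in demands.items()
    ) / total
    return expected - cost_weight * agent.cost


def route(agents: List[AgentProfile], demands: Dict[str, float],
          cost_weight: float = 0.1) -> AgentProfile:
    """Select the agent with the highest trade-off score for the current turn."""
    return max(agents, key=lambda a: score(a, demands, cost_weight))


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    agents = [
        AgentProfile("large", {"math": 0.9, "retrieval": 0.8}, cost=1.0),
        AgentProfile("small", {"math": 0.5, "retrieval": 0.75}, cost=0.1),
    ]
    print(route(agents, {"retrieval": 1.0}).name)
    print(route(agents, {"math": 1.0}).name)
```

With these made-up profiles, a retrieval-only turn goes to the cheaper agent because its small competence gap is outweighed by the cost penalty, while a math-heavy turn still goes to the stronger agent. Because the decision is re-evaluated per turn from that turn's skill demands, a router of this shape does not default to the single most expensive agent, which is the routing-collapse behavior the abstract attributes to RL-trained orchestrators.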
Problem

Research questions and friction points this paper is trying to address.

agent routing
skill transfer
compound AI systems
routing collapse
orchestration
Innovation

Methods, ideas, or system contributions that make the work stand out.

skill-aware orchestration
agent routing
compound AI systems
performance-cost trade-off
sample-efficient learning