EvoRoute: Experience-Driven Self-Routing LLM Agent Systems

📅 2026-01-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the trilemma that large language model (LLM) agents face in complex multi-turn tasks: balancing performance, cost, and latency. It proposes an experience-driven, self-evolving routing mechanism that dynamically selects a Pareto-optimal model at each step and continuously refines its routing strategy using feedback from historical executions. The authors present this as the first integration of dynamic model routing with continual experiential learning. The system comprises an experience knowledge base, Pareto-optimal model selection, and policy updates driven by environment feedback. Evaluated on benchmarks such as GAIA and BrowseComp+, it reduces cost by up to 80% and latency by over 70% while maintaining or even improving task performance.
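To make "Pareto-optimal model selection" concrete, here is a minimal sketch of a Pareto filter over candidate models. It is not EvoRoute's actual implementation: the `Candidate` type, the model names, and all numbers are hypothetical, and the three objectives (accuracy, cost, latency) are simply taken from the trilemma described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-model estimates for one agent step."""
    name: str
    accuracy: float  # higher is better
    cost: float      # lower is better
    latency: float   # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on every objective and better on one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.cost <= b.cost
                and a.latency <= b.latency)
    strictly_better = (a.accuracy > b.accuracy
                       or a.cost < b.cost
                       or a.latency < b.latency)
    return no_worse and strictly_better


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Illustrative, made-up numbers (not results from the paper).
candidates = [
    Candidate("large", 0.90, 10.0, 8.0),
    Candidate("medium", 0.85, 3.0, 4.0),
    Candidate("small", 0.70, 1.0, 2.0),
    Candidate("slow-small", 0.65, 1.5, 5.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Here `slow-small` is dropped because `small` is at least as good on all three objectives. A router would still need a rule for picking one model from the remaining front, which this sketch leaves open.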

📝 Abstract

Complex agentic AI systems, powered by a coordinated ensemble of Large Language Models (LLMs), tool and memory modules, have demonstrated remarkable capabilities on intricate, multi-turn tasks. However, this success is shadowed by prohibitive economic costs and severe latency, exposing a critical, yet underexplored, trade-off. We formalize this challenge as the Agent System Trilemma: the inherent tension among achieving state-of-the-art performance, minimizing monetary cost, and ensuring rapid task completion. To dismantle this trilemma, we introduce EvoRoute, a self-evolving model routing paradigm that transcends static, pre-defined model assignments. Leveraging an ever-expanding knowledge base of prior experience, EvoRoute dynamically selects Pareto-optimal LLM backbones at each step, balancing accuracy, efficiency, and resource use, while continually refining its own selection policy through environment feedback. Experiments on challenging agentic benchmarks such as GAIA and BrowseComp+ demonstrate that EvoRoute, when integrated into off-the-shelf agentic systems, not only sustains or enhances system performance but also reduces execution cost by up to 80% and latency by over 70%.
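As a rough illustration of how a knowledge base of prior experience could feed a routing policy, the sketch below keeps running averages of success, cost, and latency per (step type, model) pair and updates them after each execution. This is an assumption-laden toy and not the paper's method: the `ExperienceStore` class, its method names, and the idea of keying on a step type are all hypothetical.

```python
from collections import defaultdict


class ExperienceStore:
    """Hypothetical store of running averages per (step_type, model)."""

    def __init__(self) -> None:
        self._stats = defaultdict(
            lambda: {"n": 0, "success": 0.0, "cost": 0.0, "latency": 0.0}
        )

    def record(self, step_type: str, model: str,
               success: bool, cost: float, latency: float) -> None:
        """Fold one observed execution into the running means."""
        s = self._stats[(step_type, model)]
        s["n"] += 1
        n = s["n"]
        # Incremental mean update: mean += (x - mean) / n
        s["success"] += (float(success) - s["success"]) / n
        s["cost"] += (cost - s["cost"]) / n
        s["latency"] += (latency - s["latency"]) / n

    def estimate(self, step_type: str, model: str):
        """Return a copy of the current estimates, or None if unseen."""
        key = (step_type, model)
        return dict(self._stats[key]) if key in self._stats else None


store = ExperienceStore()
store.record("web_search", "small", success=True, cost=1.0, latency=2.0)
store.record("web_search", "small", success=False, cost=3.0, latency=4.0)
print(store.estimate("web_search", "small"))
```

Estimates like these could serve as the per-model objective values that a Pareto filter compares, so that routing decisions shift as more feedback accumulates.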

Problem

Research questions and friction points this paper is trying to address.

Agent System Trilemma

Large Language Models

cost-latency trade-off

multi-turn tasks

agentic AI systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

EvoRoute

model routing

agent system trilemma