🤖 AI Summary
Existing model routing methods suffer from poor scalability and difficulty adapting to rapidly expanding large language model (LLM) ecosystems. To address this, we propose TagRouter—a training-free, lightweight semantic routing method for open-domain text generation. TagRouter automatically extracts multi-granular semantic labels to represent queries and performs zero-shot evaluation of LLM suitability, dynamically routing each query to the optimal model. Its core innovation is the first label-based, training-agnostic routing mechanism, enabling plug-and-play integration and continuous evolution toward a scalable “super-model” architecture. Experiments across multiple benchmarks demonstrate that TagRouter significantly outperforms 13 baseline methods: system acceptance rate improves by 6.15%, inference cost decreases by 17.20%, and it achieves high accuracy, low computational overhead, and strong scalability.
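The tag-based routing idea described above can be sketched in a few lines. The following is a hypothetical illustration, not the paper's actual implementation: each candidate model advertises a set of capability tags plus a per-query cost, the query is represented by its extracted tags, and the router picks the model with the best tag overlap, breaking ties toward the cheaper model.

```python
# Minimal sketch of tag-based routing (illustrative only; the tag names,
# model names, and costs below are invented for the example).

def route(query_tags, models):
    """Pick the model with the highest tag overlap with the query;
    break ties by preferring the lower-cost model.

    models: dict mapping model name -> {"tags": set of str, "cost": float}
    """
    def score(item):
        _name, info = item
        overlap = len(query_tags & info["tags"])
        # Maximize overlap first; among equal overlaps, minimize cost.
        return (overlap, -info["cost"])

    return max(models.items(), key=score)[0]

models = {
    "small-llm": {"tags": {"chitchat", "summarization"}, "cost": 1.0},
    "large-llm": {"tags": {"reasoning", "coding", "summarization"}, "cost": 10.0},
}

print(route({"summarization", "chitchat"}, models))  # small-llm
print(route({"coding"}, models))                     # large-llm
```

Because the routing decision is a lookup over tags rather than a learned classifier, adding a new model to the pool only requires registering its tag set, which is what makes this style of router training-free and easy to scale.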
📝 Abstract
Model routing allocates each query to a suitable model, improving system performance while reducing costs. However, existing routing methods face practical limitations that hinder scalability in large-scale applications and struggle to keep up with the rapid growth of the large language model (LLM) ecosystem. To tackle these challenges, we propose TagRouter, a training-free model routing method designed to optimize the synergy among multiple LLMs for open-domain text generation tasks. Experimental results demonstrate that TagRouter outperforms 13 baseline methods, increasing the system's acceptance rate by 6.15% and reducing costs by 17.20%, achieving optimal cost-efficiency. Our findings provide the LLM community with an efficient and scalable solution for model ensembling, offering users an evolvable "super model."