🤖 AI Summary
This work addresses the rapidly expanding landscape of AI models by formally introducing the problem of continual model routing (CMR): selecting among thousands of expert models while adapting to a continuous influx of new models and tasks. To this end, the authors construct CMRBench, a large-scale benchmark that simulates model hub expansion and comprises over 2,000 candidate models, and propose CARvE, a contrastive embedding-based routing method that combines checkpoint-based anchoring with structured replay. Experimental results show that CARvE significantly outperforms zero-shot retrieval, fine-tuning, and adapter-merging baselines in model-level, family-level, and domain-level routing accuracy.
📝 Abstract
AI model hubs provide access to a rapidly growing collection of powerful pre-trained models, enabling off-the-shelf mixture-of-experts systems with different routing strategies. However, this rapid growth poses two fundamental challenges: scaling model selection across thousands of experts and continually updating routing mechanisms as new models and tasks are introduced. In this paper, we formalise this setting as Continual Model Routing (CMR) and propose CMRBench, a new large-scale benchmark simulating realistic hub expansion and including over 2,000 candidate models. Finally, we introduce CARvE, a contrastive embedding approach for efficient continual model routing via checkpoint-based anchoring and structured replay. Extensive empirical results and ablations show that CARvE significantly outperforms zero-shot retrieval, fine-tuning, and adapter-merging baselines in model-, family-, and domain-level accuracy.