🤖 AI Summary
Expert models excel in their specialized domains but cannot adapt zero-shot to novel tasks, and no existing approach simultaneously offers high per-task performance and broad generalization. Method: This paper proposes Model Label Learning (MLL), a new paradigm that constructs a Semantic Directed Acyclic Graph (SDAG) to explicitly model the functional semantics of models, enabling interpretable alignment between model capabilities and task requirements; it further introduces the Classification Head Combination Optimization (CHCO) algorithm for zero-shot, fine-tuning-free model selection. The resulting model hub enables plug-and-play reuse of expert models via label-driven orchestration. Contribution/Results: Experiments on seven real-world datasets demonstrate significant gains in zero-shot accuracy, especially for small models, and show that zero-shot performance improves consistently as the hub scales. MLL is presented as the first approach to unify strong generalization, scalability, and retention of expert-model accuracy.
📝 Abstract
Vision-language models (VLMs) like CLIP have demonstrated impressive zero-shot ability in image classification by aligning text and images, but they underperform task-specific expert models. Conversely, expert models excel in their specialized domains but lack zero-shot ability on new tasks. Obtaining both the high performance of expert models and broad zero-shot ability is an important research direction. In this paper, we show that by constructing a model hub and aligning models with their functionalities using model labels, new tasks can be solved in a zero-shot manner by effectively selecting and reusing models in the hub. We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities through a Semantic Directed Acyclic Graph (SDAG) and leverages an algorithm, Classification Head Combination Optimization (CHCO), to select capable models for new tasks. Compared with the foundation-model paradigm, MLL is less costly and more scalable, i.e., its zero-shot ability grows with the size of the model hub. Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL, demonstrating that expert models can be effectively reused for zero-shot tasks. Our code will be released publicly.
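To make the label-driven selection idea concrete, here is a minimal sketch of matching a new task's class labels against semantic labels attached to models in a hub. All names (`ModelEntry`, `select_models`, the example experts) are hypothetical illustrations, not the paper's actual SDAG or CHCO implementation, which operate on a richer semantic graph rather than flat label sets.

```python
# Hypothetical sketch: rank hub models by how well their semantic labels
# cover a new task's classes, then reuse the top-ranked experts zero-shot.
from dataclasses import dataclass, field

@dataclass
class ModelEntry:
    name: str
    labels: set = field(default_factory=set)  # semantic labels the model can classify

def select_models(hub, task_labels, top_k=2):
    """Return names of up to top_k models whose labels best cover task_labels."""
    def coverage(entry):
        return len(entry.labels & task_labels) / len(task_labels)
    ranked = sorted(hub, key=coverage, reverse=True)
    return [e.name for e in ranked[:top_k] if coverage(e) > 0]

hub = [
    ModelEntry("bird-expert", {"sparrow", "eagle", "owl"}),
    ModelEntry("pet-expert", {"cat", "dog", "sparrow"}),
    ModelEntry("car-expert", {"sedan", "truck"}),
]
print(select_models(hub, {"sparrow", "owl", "cat"}))  # → ['bird-expert', 'pet-expert']
```

In the paper's full method, the flat coverage score above is replaced by the SDAG, which organizes labels hierarchically, and CHCO, which optimizes which classification heads to combine; this sketch only illustrates the zero-shot, fine-tuning-free selection principle.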