🤖 AI Summary
Selecting optimal pre-trained source models for few-shot target tasks remains computationally prohibitive due to the need for costly fine-tuning-based evaluation across large model repositories.
Method: We propose BeST, a training-free task-similarity metric built on a quantization-level optimization procedure for classification tasks, combined with an early-stopping-style criterion that yields a function approximating the transfer-learning mapping. This enables rapid estimation of source-to-target transferability without any fine-tuning or adaptation.
Contribution/Results: BeST consistently identifies the most transferable source model(s) for a given target task across multiple benchmarks and few-shot settings: it reduces source-model evaluation overhead by over 90% on average while improving downstream accuracy by 2.1–5.7 percentage points. This establishes a high-accuracy, low-cost paradigm for source model selection in few-shot transfer learning.
📝 Abstract
One of the most fundamental, and yet relatively less explored, goals in transfer learning is the efficient means of selecting top candidates from a large number of previously trained models (optimized for various "source" tasks) that would perform the best for a new "target" task with a limited amount of data. In this paper, we undertake this goal by developing a novel task-similarity metric (BeST) and an associated method that consistently performs well in identifying the most transferrable source(s) for a given task. In particular, our design employs an innovative quantization-level optimization procedure in the context of classification tasks that yields a measure of similarity between a source model and the given target data. The procedure uses a concept similar to early stopping (usually implemented to train deep neural networks (DNNs) to ensure generalization) to derive a function that approximates the transfer learning mapping without training. The advantage of our metric is that it can be quickly computed to identify the top candidate(s) for a given target task before a computationally intensive transfer operation (typically using DNNs) can be implemented between the selected source and the target task. As such, our metric can provide significant computational savings for transfer learning from a selection of a large number of possible source models. Through extensive experimental evaluations, we establish that our metric performs well over different datasets and varying numbers of data samples.
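To make the idea concrete, the sketch below illustrates the *flavor* of a training-free, quantization-based transferability score: target samples are embedded by a frozen source model, each feature dimension is quantized into bins, and a label-purity score measures how well the quantized features already separate target classes, with an early-stopping-like rule that stops refining the quantization once the score no longer improves. This is a hedged, illustrative approximation only; the function names (`bin_purity`, `transferability_score`), the equal-width binning, and the purity criterion are our assumptions, not the paper's actual BeST algorithm.

```python
import numpy as np

def bin_purity(codes, labels):
    """Fraction of samples whose label matches the majority label of their bin.
    A purity of 1.0 means the quantized feature perfectly separates classes."""
    correct = 0
    for c in np.unique(codes):
        # Count the most frequent label among samples falling into bin c.
        correct += np.bincount(labels[codes == c]).max()
    return correct / len(labels)

def transferability_score(features, labels, max_levels=16):
    """Illustrative training-free score (an assumption, not the paper's method).
    features: (n, d) embeddings of target samples from a frozen source model.
    labels:   (n,) integer target class labels.
    Returns a score in (0, 1]; higher suggests better transferability."""
    n, d = features.shape
    mins = features.min(axis=0)
    spans = features.max(axis=0) - mins + 1e-12  # avoid divide-by-zero
    best = 0.0
    for levels in range(2, max_levels + 1):
        # Quantize each feature dimension into `levels` equal-width bins.
        q = np.floor((features - mins) / spans * levels)
        q = q.clip(0, levels - 1).astype(int)
        # Average label purity across feature dimensions.
        score = np.mean([bin_purity(q[:, j], labels) for j in range(d)])
        # Early-stopping-like rule: stop refining once the score stops improving,
        # mirroring how early stopping halts training before overfitting.
        if score <= best:
            break
        best = score
    return best
```

In this toy form, the score is computed per source model on the same few-shot target data, and the model(s) with the highest score would be shortlisted for the expensive fine-tuning step.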