Occam's model: Selecting simpler representations for better transferability estimation

📅 2025-02-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the challenge of evaluating pre-trained model transferability to downstream tasks without fine-tuning. We propose a lightweight, forward-pass-only evaluation paradigm that quantifies transferability via the linear separability of representations for target classes and their intrinsic learnability. Grounded in Occam's razor, our approach integrates representation complexity analysis, linear separability criteria, and information bottleneck theory to derive two theoretically interpretable, computationally efficient metrics. Crucially, the method avoids gradient computation and parameter updates, enhancing robustness and generalization across diverse settings. Extensive experiments on multi-task and multi-architecture benchmarks demonstrate consistent superiority over existing training-free evaluation methods: average Kendall's Tau correlation improves by 24%, with a maximum gain of 32%.
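The summary describes scoring transferability from the linear separability of a frozen model's features for the target classes, using only forward passes. As a rough illustration of that idea (not the paper's actual metrics, which are not specified here), one can compare between-class to within-class scatter of the extracted features; the `separability_score` function and the synthetic data below are hypothetical:

```python
import numpy as np

def separability_score(features, labels):
    """Toy separability proxy: ratio of between-class to within-class
    scatter of frozen features. Higher = classes easier to separate."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = features[labels == c]          # features of one target class
        mu = cls.mean(axis=0)
        between += len(cls) * np.sum((mu - overall_mean) ** 2)
        within += np.sum((cls - mu) ** 2)
    return between / (within + 1e-12)

# Synthetic check: well-separated class clusters should score higher
# than heavily overlapping ones.
rng = np.random.default_rng(0)
far = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(10, 1, (50, 8))])
near = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(0.5, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
print(separability_score(far, y) > separability_score(near, y))  # True
```

Because the score needs only extracted features and labels, it matches the paper's stated constraint of no gradient computation or parameter updates.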

๐Ÿ“ Abstract
Fine-tuning models that have been pre-trained on large datasets has become a cornerstone of modern machine learning workflows. With the widespread availability of online model repositories, such as Hugging Face, it is now easier than ever to fine-tune pre-trained models for specific tasks. This raises a critical question: which pre-trained model is most suitable for a given task? This problem is called transferability estimation. In this work, we introduce two novel and effective metrics for estimating the transferability of pre-trained models. Our approach is grounded in viewing transferability as a measure of how easily a pre-trained model's representations can be trained to separate target classes, providing a unique perspective on transferability estimation. We rigorously evaluate the proposed metrics against state-of-the-art alternatives across diverse problem settings, demonstrating their robustness and practical utility. Additionally, we present theoretical insights that explain our metrics' efficacy and adaptability to various scenarios. We experimentally show that our metrics increase Kendall's Tau by up to 32% compared to the state-of-the-art baselines.
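Both the summary and the abstract report gains in Kendall's Tau, the standard yardstick for transferability estimation: it measures how well the estimated ranking of candidate pre-trained models agrees with their actual fine-tuned performance. A minimal self-contained sketch (the model scores and accuracies below are made up for illustration):

```python
from itertools import combinations

def kendall_tau(scores, targets):
    """Kendall's Tau (simple form, no tie correction): fraction of
    concordant minus discordant pairs between two rankings."""
    concordant = discordant = 0
    for (s1, t1), (s2, t2) in combinations(zip(scores, targets), 2):
        sign = (s1 - s2) * (t1 - t2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(scores)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical transferability scores for five pre-trained models,
# compared against their (hypothetical) fine-tuned accuracies.
estimated = [0.62, 0.48, 0.91, 0.33, 0.70]
fine_tuned_acc = [0.81, 0.74, 0.93, 0.69, 0.84]
print(kendall_tau(estimated, fine_tuned_acc))  # 1.0: rankings agree perfectly
```

A Tau of 1.0 means the metric ranks models exactly as fine-tuning would; the paper's reported "up to 32%" gain refers to improvements in this correlation over prior training-free baselines.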
Problem

Research questions and friction points this paper is trying to address.

Estimating pre-trained model transferability
Selecting models for specific tasks
Improving transferability estimation metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simpler representations for transferability
Novel metrics for model suitability
Theoretical insights enhance metric efficacy