Occam's model: Selecting simpler representations for better transferability estimation

📅 2025-02-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the challenge of evaluating pre-trained model transferability to downstream tasks without fine-tuning. We propose a lightweight, forward-pass-only evaluation paradigm that quantifies transferability via the linear separability of representations for target classes and their intrinsic learnability. Grounded in Occam's razor, our approach integrates representation complexity analysis, linear separability criteria, and information bottleneck theory to derive two theoretically interpretable, computationally efficient metrics. Crucially, the method avoids gradient computation and parameter updates, enhancing robustness and generalization across diverse settings. Extensive experiments on multi-task and multi-architecture benchmarks demonstrate consistent superiority over existing training-free evaluation methods: average Kendall's Tau correlation improves by 24%, with a maximum gain of 32%.
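The summary describes scoring transferability from the linear separability of a frozen model's features for the target classes, using only forward passes. As a rough illustration of that idea (not the paper's actual metrics, which are not specified here), one can compare between-class to within-class scatter of the extracted features; the `separability_score` function and the synthetic data below are hypothetical:

```python
import numpy as np

def separability_score(features, labels):
    """Toy separability proxy: ratio of between-class to within-class
    scatter of frozen features. Higher = classes easier to separate."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = features[labels == c]          # features of one target class
        mu = cls.mean(axis=0)
        between += len(cls) * np.sum((mu - overall_mean) ** 2)
        within += np.sum((cls - mu) ** 2)
    return between / (within + 1e-12)

# Synthetic check: well-separated class clusters should score higher
# than heavily overlapping ones.
rng = np.random.default_rng(0)
far = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(10, 1, (50, 8))])
near = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(0.5, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
print(separability_score(far, y) > separability_score(near, y))  # True
```

Because the score needs only extracted features and labels, it matches the paper's stated constraint of no gradient computation or parameter updates.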

๐Ÿ“ Abstract
Fine-tuning models that have been pre-trained on large datasets has become a cornerstone of modern machine learning workflows. With the widespread availability of online model repositories, such as Hugging Face, it is now easier than ever to fine-tune pre-trained models for specific tasks. This raises a critical question: which pre-trained model is most suitable for a given task? This problem is called transferability estimation. In this work, we introduce two novel and effective metrics for estimating the transferability of pre-trained models. Our approach is grounded in viewing transferability as a measure of how easily a pre-trained model's representations can be trained to separate target classes, providing a unique perspective on transferability estimation. We rigorously evaluate the proposed metrics against state-of-the-art alternatives across diverse problem settings, demonstrating their robustness and practical utility. Additionally, we present theoretical insights that explain our metrics' efficacy and adaptability to various scenarios. We experimentally show that our metrics increase Kendall's Tau by up to 32% compared to the state-of-the-art baselines.
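Both the summary and the abstract report gains in Kendall's Tau, the standard yardstick for transferability estimation: it measures how well the estimated ranking of candidate pre-trained models agrees with their actual fine-tuned performance. A minimal self-contained sketch (the model scores and accuracies below are made up for illustration):

```python
from itertools import combinations

def kendall_tau(scores, targets):
    """Kendall's Tau (simple form, no tie correction): fraction of
    concordant minus discordant pairs between two rankings."""
    concordant = discordant = 0
    for (s1, t1), (s2, t2) in combinations(zip(scores, targets), 2):
        sign = (s1 - s2) * (t1 - t2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(scores)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical transferability scores for five pre-trained models,
# compared against their (hypothetical) fine-tuned accuracies.
estimated = [0.62, 0.48, 0.91, 0.33, 0.70]
fine_tuned_acc = [0.81, 0.74, 0.93, 0.69, 0.84]
print(kendall_tau(estimated, fine_tuned_acc))  # 1.0: rankings agree perfectly
```

A Tau of 1.0 means the metric ranks models exactly as fine-tuning would; the paper's reported "up to 32%" gain refers to improvements in this correlation over prior training-free baselines.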
Problem

Research questions and friction points this paper is trying to address.

Estimating pre-trained model transferability
Selecting models for specific tasks
Improving transferability estimation metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simpler representations for transferability
Novel metrics for model suitability
Theoretical insights enhance metric efficacy