Model Diffusion for Certifiable Few-shot Transfer Learning

📅 2025-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Parameter-efficient fine-tuning (PEFT) for few-shot transfer learning lacks verifiable generalization guarantees. Method: We propose the first transfer learning framework with a provable, non-vacuous generalization error upper bound. To overcome the vacuity of risk bounds arising from continuous parameter spaces, we introduce diffusion models to approximate the posterior distribution over PEFT parameters and sample a finite set of candidate models for downstream tasks. Contribution/Results: Leveraging learning theory and Bayesian model selection, we derive a tight, computationally tractable generalization bound. Empirical evaluation demonstrates that our bound is significantly tighter than existing bounds, which become vacuous in few-shot settings. This work establishes the first theoretically rigorous and practically applicable generalization guarantee for PEFT, advancing reliable AI deployment.

📝 Abstract
In modern large-scale deep learning, a prevalent and effective workflow for solving low-data problems is adapting powerful pre-trained foundation models (FMs) to new tasks via parameter-efficient fine-tuning (PEFT). However, while empirically effective, the resulting solutions lack generalisation guarantees to certify their accuracy, which may be required for ethical or legal reasons prior to deployment in high-importance applications. In this paper we develop a novel transfer learning approach that is designed to facilitate non-vacuous learning theoretic generalisation guarantees for downstream tasks, even in the low-shot regime. Specifically, we first use upstream tasks to train a distribution over PEFT parameters. We then learn the downstream task by a sample-and-evaluate procedure -- sampling plausible PEFTs from the trained diffusion model and selecting the one with the highest likelihood on the downstream data. Crucially, this confines our model hypothesis to a finite set of PEFT samples. In contrast to learning in the typical continuous hypothesis spaces of neural network weights, this facilitates tighter risk certificates. We instantiate our bound and show non-trivial generalisation guarantees compared to existing learning approaches, which lead to vacuous bounds in the low-shot regime.
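The abstract's sample-and-evaluate idea can be illustrated with a minimal sketch. Note this is not the paper's implementation: `sample_peft` is a hypothetical stand-in for drawing PEFT parameters from the trained diffusion model, and the task is a toy linear classifier. The certificate shown is the standard finite-hypothesis Hoeffding bound with a union bound over the K sampled candidates, which is the kind of bound a finite hypothesis set makes tractable.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in (assumption): in the paper this would draw a PEFT
# parameter vector from the diffusion model trained on upstream tasks.
# Here it is just Gaussian noise, for illustration only.
def sample_peft(dim=8):
    return rng.normal(size=dim)

def empirical_risk(theta, X, y):
    # Toy 0-1 loss of a linear classifier parameterised by theta.
    preds = (X @ theta > 0).astype(int)
    return float(np.mean(preds != y))

# Toy few-shot downstream task: n labelled examples.
n, dim = 32, 8
X = rng.normal(size=(n, dim))
y = (X @ rng.normal(size=dim) > 0).astype(int)

# Sample a finite set of K candidate PEFTs and pick the best on the data
# (the paper selects by likelihood; 0-1 risk plays that role here).
K, delta = 50, 0.05
candidates = [sample_peft(dim) for _ in range(K)]
risks = [empirical_risk(t, X, y) for t in candidates]
best = int(np.argmin(risks))

# Finite-hypothesis risk certificate (Hoeffding + union bound over K models):
# with probability >= 1 - delta,
#   true_risk(best) <= empirical_risk(best) + sqrt(ln(K / delta) / (2 n)).
bound = risks[best] + math.sqrt(math.log(K / delta) / (2 * n))
print(f"empirical risk: {risks[best]:.3f}, certified bound: {bound:.3f}")
```

Because the hypothesis set has only K elements, the complexity term grows as sqrt(ln K / n) rather than depending on the dimensionality of the continuous weight space, which is why the resulting certificate can stay non-vacuous even at small n.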
Problem

Research questions and friction points this paper is trying to address.

few-shot transfer learning
generalization guarantees
low-data problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient fine-tuning
Diffusion model sampling
Finite hypothesis space