🤖 AI Summary
This work addresses a core challenge in parameter-efficient fine-tuning (PEFT): automatically identifying a minimal yet highly influential subset of trainable parameters. We formulate subset selection as a multi-objective optimization problem for the first time and transform it, via ε-constraint scalarization and a second-order Taylor approximation, into a 0–1 knapsack problem, which we solve via Pareto-optimal search. Building on this, we propose AdaPEFT, a Hessian-guided adaptive selection mechanism that transfers across tasks and model scales. Experiments demonstrate that AdaPEFT achieves performance on par with or superior to state-of-the-art PEFT methods across diverse NLP and vision benchmarks while training fewer than 0.1% of the parameters. Crucially, the selected parameter subsets transfer well across training steps and model sizes, which indicates that the selection generalizes beyond the setting in which it was made.
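The chain of transformations named above can be written out as a rough sketch. The notation here is assumed for illustration and is not taken from the source: $m \in \{0,1\}^K$ is a mask over $K$ parameter groups, $n_k$ is the size of group $k$, $g_k$ and $H_k$ are the gradient and Hessian block of group $k$, $\eta$ is a learning rate, and $\varepsilon$ is the parameter budget. The sketch also assumes that per-group loss decreases add up (cross-group Hessian terms are ignored) and that $g_k^\top H_k g_k > 0$; the paper's exact formulation may differ.

```latex
% Sketch only: all symbols are assumed notation, not the paper's.
\begin{align*}
&\text{Two objectives:}
  && \min_{m \in \{0,1\}^K} \Bigl( L(m),\ \sum_{k=1}^{K} m_k n_k \Bigr) \\
&\varepsilon\text{-constraint:}
  && \min_{m \in \{0,1\}^K} L(m)
     \quad \text{s.t.} \quad \sum_{k=1}^{K} m_k n_k \le \varepsilon \\
&\text{Second-order Taylor (one step on group } k\text{):}
  && \Delta L_k(\eta) \approx \eta\, g_k^\top g_k
     - \frac{\eta^2}{2}\, g_k^\top H_k g_k \\
&\text{Best step } \eta_k^\star = \frac{g_k^\top g_k}{g_k^\top H_k g_k}\text{:}
  && \Delta L_k^\star = \frac{\bigl(g_k^\top g_k\bigr)^2}{2\, g_k^\top H_k g_k} \\
&\text{0--1 knapsack:}
  && \max_{m \in \{0,1\}^K} \sum_{k=1}^{K} m_k\, \Delta L_k^\star
     \quad \text{s.t.} \quad \sum_{k=1}^{K} m_k n_k \le \varepsilon
\end{align*}
```

Sweeping the budget $\varepsilon$ and solving the knapsack for each value traces out candidate trade-offs between predicted loss decrease and parameter count, which is one way to read the "Pareto-optimal search" mentioned above.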
📝 Abstract
Parameter-efficient fine-tuning (PEFT) is a highly effective approach for adapting large pre-trained models to downstream tasks with minimal computational overhead. At their core, PEFT methods freeze most parameters and train only a small subset (say $<0.1\%$ of total parameters). Notably, different PEFT methods select different subsets, resulting in varying levels of performance. This variation prompts a key question: how can we effectively select the most influential subset to train? We formulate the subset selection as a multi-task problem with two objectives: maximizing the performance and minimizing the number of trainable parameters. We leverage a series of transformations, including the $\epsilon$-constraint method and a second-order Taylor approximation, to arrive at the classical 0-1 knapsack problem, which we solve through the lens of Pareto optimality. Consequently, we propose AdaPEFT, a Hessian-informed PEFT method that adapts to various tasks and models, and whose selected subset empirically transfers across training horizons and model sizes.
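To make the knapsack view concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names `taylor_gain` and `greedy_knapsack`, the group names, and all numbers are hypothetical. It assumes each parameter group has a known squared gradient norm and a positive curvature along its gradient, scores each group by the loss decrease predicted by a second-order Taylor model at the best step size, and then uses a greedy gain-per-parameter heuristic, which only approximates the exact 0-1 knapsack solution.

```python
from typing import List, Sequence


def taylor_gain(grad_sq_norm: float, curvature: float) -> float:
    """Predicted loss decrease of one gradient step at the best learning rate.

    Second-order model: L(eta) ~ L0 - eta * a + 0.5 * eta**2 * b, where
    a = g.g (squared gradient norm) and b = g.H.g (curvature along g).
    For b > 0 the minimizer is eta = a / b, giving a decrease of a**2 / (2 * b).
    """
    if curvature <= 0:
        raise ValueError("this sketch assumes positive curvature along the gradient")
    return grad_sq_norm ** 2 / (2.0 * curvature)


def greedy_knapsack(gains: Sequence[float], costs: Sequence[int], budget: int) -> List[int]:
    """Greedy heuristic: take groups by gain per parameter while they fit the budget."""
    order = sorted(range(len(gains)), key=lambda k: gains[k] / costs[k], reverse=True)
    chosen, used = [], 0
    for k in order:
        if used + costs[k] <= budget:
            chosen.append(k)
            used += costs[k]
    return sorted(chosen)


# Hypothetical per-group statistics, made up purely for illustration.
groups = ["bias", "layernorm", "attention", "mlp"]
grad_sq = [0.4, 0.6, 2.0, 3.0]                # assumed g.g per group
curv = [0.5, 1.0, 8.0, 30.0]                  # assumed g.H.g per group
n_params = [1_000, 2_000, 400_000, 800_000]   # assumed group sizes

gains = [taylor_gain(a, b) for a, b in zip(grad_sq, curv)]
selected = greedy_knapsack(gains, n_params, budget=5_000)
print([groups[k] for k in selected])  # prints ['bias', 'layernorm']
```

With a budget of 5,000 trainable parameters, the small groups win because their predicted gain per parameter is far higher, even though the attention group has the largest absolute gain. Rerunning the selection over a range of budgets gives a simple picture of the performance versus parameter-count trade-off.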