Adaptive parameter-efficient fine-tuning via Hessian-informed subset selection

📅 2025-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a core challenge in parameter-efficient fine-tuning (PEFT): automatically identifying a minimal yet highly influential subset of trainable parameters. The authors formulate subset selection as a multi-objective optimization problem (maximize performance, minimize the number of trainable parameters) and, via ε-constraint scalarization and a second-order Taylor approximation, transform it into a 0-1 knapsack problem that is solved through Pareto-optimal search. On this basis they propose AdaPEFT, a Hessian-guided adaptive selection mechanism that adapts across tasks and model scales. Experiments reportedly show that AdaPEFT performs on par with or better than state-of-the-art PEFT methods across NLP and vision benchmarks while training fewer than 0.1% of the parameters. The selected parameter subsets also transfer empirically across training horizons and model sizes.

📝 Abstract

Parameter-efficient fine-tuning (PEFT) is a highly effective approach for adapting large pre-trained models to downstream tasks with minimal computational overhead. At the core, PEFT methods freeze most parameters and train only a small subset (say $<0.1\%$ of total parameters). Notably, different PEFT methods select different subsets, resulting in varying levels of performance. This variation prompts a key question: how can we effectively select the most influential subset to train? We formulate the subset selection as a multi-task problem: maximizing the performance and minimizing the number of trainable parameters. We leverage a series of transformations, including the $\epsilon$-constraint method and a second-order Taylor approximation, to arrive at the classical 0-1 knapsack problem, which we solve through the lens of Pareto optimality. Consequently, we propose AdaPEFT, a Hessian-informed PEFT method that adapts to various tasks and models, in which the selected subset empirically transfers across training horizons and model sizes.
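
The chain of transformations named in the abstract can be sketched in a few lines. This is one plausible reading, not the paper's exact derivation: the symbols $m_k$ (binary mask over $K$ parameter groups), $c_k$ (parameter count of group $k$), $g_k$ and $H_k$ (gradient and Hessian block of group $k$), $\eta_k$ (step size) and $v_k$ (estimated loss decrease) are all assumptions introduced here, as is the choice to ignore curvature between groups.

```latex
% Two objectives over a binary mask m \in \{0,1\}^K on K parameter groups:
\min_{m}\; \Big( L(m),\; \sum_{k=1}^{K} m_k c_k \Big)

% epsilon-constraint scalarization: keep one objective, bound the other.
\min_{m \in \{0,1\}^K} L(m) \quad \text{s.t.} \quad \sum_{k=1}^{K} m_k c_k \le \epsilon

% Second-order Taylor model of the loss change when only group k moves by -\eta_k g_k:
\Delta L_k(\eta_k) \approx -\eta_k\, g_k^\top g_k + \tfrac{1}{2}\,\eta_k^2\, g_k^\top H_k g_k

% Minimizing over \eta_k (assuming g_k^\top H_k g_k > 0):
\eta_k^\star = \frac{g_k^\top g_k}{g_k^\top H_k g_k}, \qquad
v_k := -\Delta L_k(\eta_k^\star) = \frac{(g_k^\top g_k)^2}{2\, g_k^\top H_k g_k}

% Treating the per-group gains as additive (an assumption) gives a 0-1 knapsack:
\max_{m \in \{0,1\}^K} \sum_{k=1}^{K} m_k v_k \quad \text{s.t.} \quad \sum_{k=1}^{K} m_k c_k \le \epsilon
```

Under this reading, each budget $\epsilon$ yields one subset, and sweeping $\epsilon$ traces out the trade-off between performance and parameter count that the Pareto view refers to.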

Problem

Research questions and friction points this paper is trying to address.

How to select the most influential subset for parameter-efficient fine-tuning

Maximizing performance while minimizing trainable parameters in PEFT

Adapting Hessian-informed subset selection across tasks and models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hessian-informed subset selection for PEFT

Multi-task optimization via Pareto optimality

AdaPEFT, a PEFT method that adapts to diverse tasks and models
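
To make the knapsack view of subset selection concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names `taylor_gain` and `knapsack_select`, the group names, and all numbers are hypothetical, and the per-group score assumes a second-order Taylor model in which gains from different groups simply add up.

```python
from typing import List, Sequence, Tuple


def taylor_gain(grad_sq_norm: float, curvature: float) -> float:
    """Estimated loss decrease for one parameter group (hypothetical scoring rule).

    With step size eta, the second-order Taylor model of the loss change is
    -eta * (g^T g) + 0.5 * eta^2 * (g^T H g). Minimizing over eta gives a
    decrease of (g^T g)^2 / (2 * g^T H g). Non-positive curvature is mapped
    to zero gain, which is a simplifying assumption of this sketch.
    """
    if curvature <= 0.0:
        return 0.0
    return grad_sq_norm ** 2 / (2.0 * curvature)


def knapsack_select(
    values: Sequence[float], costs: Sequence[int], budget: int
) -> Tuple[float, List[int]]:
    """Exact 0-1 knapsack by dynamic programming.

    values[i] is the estimated gain of group i, costs[i] its parameter count
    (a non-negative integer), and budget the maximum number of trainable
    parameters. Returns the best total gain and the chosen group indices.
    """
    n = len(values)
    # best[i][b] = best total gain using only the first i groups within budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, c = values[i - 1], costs[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip group i-1
            if c <= b and best[i - 1][b - c] + v > best[i][b]:
                best[i][b] = best[i - 1][b - c] + v  # option 2: train group i-1

    # Walk back through the table to recover which groups were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Toy example with made-up statistics: (name, g^T g, g^T H g, parameter count).
groups = [
    ("bias", 2.0, 4.0, 1),
    ("layernorm", 3.0, 3.0, 2),
    ("attention", 4.0, 2.0, 6),
    ("mlp", 1.0, 1.0, 8),
]
gains = [taylor_gain(gg, ghg) for _, gg, ghg, _ in groups]
sizes = [size for _, _, _, size in groups]
total_gain, picked = knapsack_select(gains, sizes, budget=8)
print(total_gain, [groups[i][0] for i in picked])
```

With a budget of 8 parameters the toy run selects the `layernorm` and `attention` groups. Rerunning with different budgets gives one subset per budget, which is one simple way to explore the trade-off between estimated gain and trainable-parameter count. For realistic parameter counts the DP table grows with the budget, so costs would need to be bucketed or a greedy gain-per-parameter heuristic used instead.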

🔎 Similar Papers

Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models