Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

📅 2024-09-02
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational cost of full-parameter fine-tuning for large language models (LLMs) and the insufficient modeling of task-specific directions (TSDs) in existing parameter-efficient fine-tuning (PEFT) methods, this work formally defines TSDs and systematically analyzes their geometric properties and optimization challenges. We propose LoRA-Dash, a novel PEFT framework that explicitly models and enhances TSDs within low-rank adaptation via three key mechanisms: (i) directional subspace projection, (ii) gradient-sensitive direction alignment optimization, and (iii) parameter freezing coupled with direction-decoupled training. Evaluated across multiple NLU and NLG benchmarks, LoRA-Dash consistently outperforms mainstream PEFT baselines—including LoRA and IA3—by 1.8–3.2 percentage points on average. Ablation studies and geometric visualizations further confirm that explicit TSD modeling is critical to the observed performance gains.
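The summary describes LoRA-Dash as identifying task-specific directions inside the pretrained weight's spectral space and then boosting adaptation along them. A minimal numpy sketch of that idea is below; it is an illustrative reconstruction, not the paper's implementation, and the selection rule (change rate of each singular direction under the LoRA update) and the `boost` coefficients are assumptions standing in for the learned quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretrained weight W and a low-rank LoRA update delta_W = B @ A (rank r).
d, r = 8, 2
W = rng.standard_normal((d, d))
B = 0.1 * rng.standard_normal((d, r))
A = 0.1 * rng.standard_normal((r, d))
delta_W = B @ A

# Candidate directions: singular vector pairs (u_i, v_i) of the pretrained weight.
U, S, Vt = np.linalg.svd(W)

# Change rate of each direction under the LoRA update: how much the update
# moves energy along (u_i, v_i), relative to the original singular value.
delta_sigma = np.array([U[:, i] @ delta_W @ Vt[i, :] for i in range(d)])
change_rate = np.abs(delta_sigma) / S

# "Task-specific" directions: the k directions with the largest change rate.
k = 2
tsd = np.argsort(change_rate)[::-1][:k]

# Dash phase (sketched): add extra coefficients only along the TSDs.
boost = np.zeros(d)
boost[tsd] = delta_sigma[tsd]  # stand-in for coefficients learned in training
W_adapted = W + delta_W + U @ np.diag(boost) @ Vt

assert W_adapted.shape == W.shape
```

In the paper's actual framework these coefficients are trained jointly with the low-rank factors; the sketch only shows how a direction can be singled out and amplified.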

📝 Abstract
Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters consumes extensive resources. To mitigate this, Parameter-Efficient Fine-Tuning (PEFT) strategies such as LoRA have been developed. In this paper, we delve into the concept of task-specific directions (TSDs), which are critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We propose a framework to clearly define these directions and explore their properties and practical utilization challenges. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during fine-tuning, thereby enhancing model performance on targeted tasks. Extensive experiments demonstrate the effectiveness of LoRA-Dash, and in-depth analyses further reveal its underlying mechanisms. The code is available at https://github.com/Chongjie-Si/Subspace-Tuning.
Problem

Research questions and friction points this paper is trying to address.

Efficient fine-tuning of large language models
Defining task-specific directions in PEFT
Enhancing model performance with LoRA-Dash
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-specific directions in PEFT
LoRA-Dash maximizes TSD impact
Enhances model performance efficiently
👥 Authors
Chongjie Si, Shanghai Jiao Tong University
Zhiyi Shi, University of Illinois at Urbana-Champaign (VLM, PEFT)
Shifan Zhang, Shanghai Jiao Tong University
Xiaokang Yang, Shanghai Jiao Tong University
Hanspeter Pfister, An Wang Professor of Computer Science, Harvard University (Visualization, Computer Graphics, Computer Vision)
Wei Shen, Shanghai Jiao Tong University