Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

2024-09-02
arXiv.org
Citations: 2
Influential: 0

AI Summary
To address the high computational cost of full-parameter fine-tuning for large language models (LLMs) and the insufficient modeling of task-specific directions (TSDs) in existing parameter-efficient fine-tuning (PEFT) methods, this work formally defines TSDs and systematically analyzes their geometric properties and optimization challenges. We propose LoRA-Dash, a novel PEFT framework that explicitly models and enhances TSDs within low-rank adaptation via three key mechanisms: (i) directional subspace projection, (ii) gradient-sensitive direction alignment optimization, and (iii) parameter freezing coupled with direction-decoupled training. Evaluated across multiple NLU and NLG benchmarks, LoRA-Dash consistently outperforms mainstream PEFT baselines, including LoRA and IA3, by 1.8–3.2 percentage points on average. Ablation studies and geometric visualizations further confirm that explicit TSD modeling is critical to the observed performance gains.

Abstract
Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters consumes extensive resources. To mitigate this, Parameter-Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions (TSDs), which are critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We propose a framework to clearly define these directions and explore their properties and the challenges of using them in practice. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during fine-tuning, thereby enhancing model performance on targeted tasks. Extensive experiments demonstrate the effectiveness of LoRA-Dash, and in-depth analyses further reveal its underlying mechanisms. The code is available at https://github.com/Chongjie-Si/Subspace-Tuning.
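
For readers unfamiliar with the low-rank adaptation that the abstract builds on, the following is a minimal sketch of a standard LoRA-style update, not of LoRA-Dash itself. It assumes the usual formulation in which a frozen weight W is adjusted by a scaled low-rank product, W + (alpha / r) * B @ A. All function and variable names are hypothetical and chosen only for illustration; the TSD mechanism described in the paper is not reproduced here.

```python
# Minimal LoRA-style sketch in pure Python (no external libraries).
# This is NOT the LoRA-Dash method; names are hypothetical illustrations.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_effective_weight(w, b, a, alpha, r):
    """Return W + (alpha / r) * B @ A; W stays frozen, only B and A train."""
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, r):
    """Parameter counts: (full fine-tuning, rank-r low-rank update)."""
    return d_out * d_in, r * (d_out + d_in)

# Toy 2x3 frozen weight with a rank-1 update.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]        # shape (d_out, r)
A = [[0.5, 0.0, -0.5]]    # shape (r, d_in)
W_eff = lora_effective_weight(W, B, A, alpha=2, r=1)

# Assumed layer size for illustration: a 4096 x 4096 projection at rank 8.
full, lora = trainable_params(4096, 4096, 8)
print(W_eff)
print(full, lora, full // lora)
```

Under these assumed sizes, the low-rank update trains 65,536 parameters instead of 16,777,216, a 256-fold reduction, which is the kind of saving that motivates PEFT methods in the first place.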
Problem

Research questions and friction points this paper is trying to address.

Efficient fine-tuning of large language models
Defining task-specific directions in PEFT
Enhancing model performance with LoRA-Dash
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-specific directions in PEFT
LoRA-Dash maximizes TSD impact
Enhances model performance efficiently