Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

📅 2024-09-02

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address the high computational cost of full-parameter fine-tuning for large language models (LLMs) and the insufficient modeling of task-specific directions (TSDs) in existing parameter-efficient fine-tuning (PEFT) methods, this work formally defines TSDs and systematically analyzes their geometric properties and optimization challenges. We propose LoRA-Dash, a novel PEFT framework that explicitly models and enhances TSDs within low-rank adaptation via three key mechanisms: (i) directional subspace projection, (ii) gradient-sensitive direction alignment optimization, and (iii) parameter freezing coupled with direction-decoupled training. Evaluated across multiple NLU and NLG benchmarks, LoRA-Dash consistently outperforms mainstream PEFT baselines—including LoRA and IA3—by 1.8–3.2 percentage points on average. Ablation studies and geometric visualizations further confirm that explicit TSD modeling is critical to the observed performance gains.

📝 Abstract

Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters consumes extensive resources. To mitigate this, Parameter Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions (TSDs), which are critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We propose a framework to clearly define these directions and explore their properties and practical utilization challenges. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during the fine-tuning process, thereby enhancing model performance on targeted tasks. Extensive experiments demonstrate the effectiveness of LoRA-Dash, and in-depth analyses further reveal its underlying mechanisms. The code is available at https://github.com/Chongjie-Si/Subspace-Tuning.
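
For readers unfamiliar with the LoRA baseline mentioned in the abstract, the following is a minimal NumPy sketch of a plain low-rank adapter. It is not LoRA-Dash and is not taken from the linked repository. The function name `lora_forward`, the layer sizes, and the scaling value `alpha` are illustrative assumptions.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Apply a frozen weight W plus a scaled low-rank update B @ A.

    Only A and B would be trained; W stays fixed.
    All names and shapes here are illustrative assumptions.
    """
    r = A.shape[0]  # rank of the adapter
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 128, 4

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01     # trainable, small random init
B = np.zeros((d_out, r))                      # trainable, zero init: update starts at 0
x = rng.standard_normal((2, d_in))            # a batch of two inputs

y = lora_forward(x, W, A, B)

full_params = W.size            # parameters touched by full fine-tuning
lora_params = A.size + B.size   # parameters touched by the adapter
print(full_params, lora_params)
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and the adapter trains far fewer parameters than the full weight matrix. The task-specific directions discussed in the paper concern which directions of the weight change matter most; this sketch does not attempt to model them.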

Problem

Research questions and friction points this paper is trying to address.

Efficient fine-tuning of large language models

Defining task-specific directions in PEFT

Enhancing model performance with LoRA-Dash

Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-specific directions in PEFT

LoRA-Dash maximizes TSD impact

Enhances model performance efficiently
