🤖 AI Summary
To address the high computational cost of full-parameter fine-tuning for large language models (LLMs) and the insufficient modeling of task-specific directions (TSDs) in existing parameter-efficient fine-tuning (PEFT) methods, this work formally defines TSDs and systematically analyzes their geometric properties and optimization challenges. The authors propose LoRA-Dash, a novel PEFT framework that explicitly models and enhances TSDs within low-rank adaptation via three key mechanisms: (i) directional subspace projection, (ii) gradient-sensitive direction alignment optimization, and (iii) parameter freezing coupled with direction-decoupled training. Evaluated across multiple NLU and NLG benchmarks, LoRA-Dash consistently outperforms mainstream PEFT baselines, including LoRA and IA3, by 1.8–3.2 percentage points on average. Ablation studies and geometric visualizations further confirm that explicit TSD modeling is critical to the observed performance gains.
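To make the TSD idea concrete, below is a minimal sketch of one plausible reading of the mechanism: treat the singular directions of a pretrained weight matrix whose singular values change most under a low-rank update as "task-specific directions," then amplify the update along those directions with extra learnable scales. This is an illustrative assumption, not the paper's exact algorithm; the function names `identify_tsd` and `dash_update` are hypothetical.

```python
import numpy as np

def identify_tsd(W, delta_W, k=2):
    """Sketch (assumption, not the authors' exact method): pick the k
    singular directions of the pretrained weight W whose singular values
    change most, relative to their magnitude, under the update delta_W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Projected change of each singular value under the update.
    delta_S = np.array([U[:, i] @ delta_W @ Vt[i, :] for i in range(len(S))])
    rates = np.abs(delta_S) / (np.abs(S) + 1e-8)  # relative change rate
    top = np.argsort(rates)[::-1][:k]             # indices of top-k directions
    return U[:, top], Vt[top, :], top

def dash_update(W, delta_W, scales, U_k, Vt_k):
    """Add the low-rank update, plus a boost along the identified
    task-specific directions weighted by learnable scales."""
    boost = sum(s * np.outer(U_k[:, i], Vt_k[i, :])
                for i, s in enumerate(scales))
    return W + delta_W + boost

# Toy usage: an 8x6 pretrained weight and a small low-rank-style update.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))
delta_W = 0.01 * rng.normal(size=(8, 6))
U_k, Vt_k, idx = identify_tsd(W, delta_W, k=2)
W_new = dash_update(W, delta_W, scales=[0.5, 0.5], U_k=U_k, Vt_k=Vt_k)
```

In this reading, the low-rank update itself is unchanged; the only additional trainable parameters are the per-direction scales, which keeps the method parameter-efficient.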
📝 Abstract
Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters requires extensive resources. To mitigate this, Parameter-Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions (TSDs), which are critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We propose a framework to clearly define these directions and explore their properties and the practical challenges of utilizing them. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during the fine-tuning process, thereby enhancing model performance on targeted tasks. Extensive experiments conclusively demonstrate the effectiveness of LoRA-Dash, and in-depth analyses further reveal its underlying mechanisms. The code is available at https://github.com/Chongjie-Si/Subspace-Tuning.