The Quest for Winning Tickets in Low-Rank Adapters

📅 2025-12-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work is the first to validate the Lottery Ticket Hypothesis (LTH) in the context of Parameter-Efficient Fine-Tuning (PEFT), investigating whether Low-Rank Adaptation (LoRA) contains sparse "winning tickets" that match the performance of dense adapters. Method: the authors find that the inter-layer sparsity distribution, rather than the precise selection of weights, is what matters, and that task-relevant subspaces harbor trainable sparse low-rank structures. Building on this, they propose Partial-LoRA: a layer-aware, task-driven structured-sparsity paradigm for LoRA training that integrates subspace alignment and multi-task joint pruning. Contribution/Results: evaluated on 8 vision and 12 language tasks, Partial-LoRA reduces LoRA parameters by up to 87% while maintaining or improving accuracy. It improves both the training efficiency and deployment practicality of LoRA, offering the first empirical confirmation of LTH in PEFT and a principled framework for sparse, adaptive low-rank learning.

📝 Abstract
The Lottery Ticket Hypothesis (LTH) suggests that over-parameterized neural networks contain sparse subnetworks ("winning tickets") capable of matching full model performance when trained from scratch. With the growing reliance on fine-tuning large pretrained models, we investigate whether LTH extends to parameter-efficient fine-tuning (PEFT), specifically focusing on Low-Rank Adaptation (LoRA) methods. Our key finding is that LTH holds within LoRAs, revealing sparse subnetworks that can match the performance of dense adapters. In particular, we find that the effectiveness of sparse subnetworks depends more on how much sparsity is applied in each layer than on the exact weights included in the subnetwork. Building on this insight, we propose Partial-LoRA, a method that systematically identifies said subnetworks and trains sparse low-rank adapters aligned with task-relevant subspaces of the pre-trained model. Experiments across 8 vision and 12 language tasks in both single-task and multi-task settings show that Partial-LoRA reduces the number of trainable parameters by up to 87%, while maintaining or improving accuracy. Our results not only deepen our theoretical understanding of transfer learning and the interplay between pretraining and fine-tuning but also open new avenues for developing more efficient adaptation strategies.
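The core construction the abstract describes (a LoRA update with a sparse "winning ticket" mask over the low-rank factors) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the rank `r`, the random masks `M_A`/`M_B`, and the uniform keep fraction are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a sparse-masked LoRA update (illustrative, not the
# paper's Partial-LoRA algorithm). The frozen weight W gets a low-rank
# additive update B @ A, with binary masks restricting which adapter
# entries are trainable.

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # LoRA down-projection
B = np.zeros((d_out, r))                 # LoRA up-projection (zero init)

# Sparse binary masks over the low-rank factors: only a fraction of
# entries remain trainable, lottery-ticket style. A uniform keep
# fraction is assumed here; the paper argues the *per-layer* amount
# of sparsity is what matters.
keep = 0.25
M_A = (rng.random(A.shape) < keep).astype(A.dtype)
M_B = (rng.random(B.shape) < keep).astype(B.dtype)

def forward(x):
    # Effective weight: frozen W plus the masked low-rank update.
    delta = (M_B * B) @ (M_A * A)
    return x @ (W + delta).T

x = rng.normal(size=(1, d_in))
y = forward(x)
# With B zero-initialized, the adapter starts as a no-op:
assert np.allclose(y, x @ W.T)
```

During training, only the unmasked entries of `A` and `B` would receive gradient updates, which is where the up-to-87% reduction in trainable parameters comes from.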
Problem

Research questions and friction points this paper is trying to address.

Finding sparse subnetworks in LoRA for efficient fine-tuning
Investigating Lottery Ticket Hypothesis in parameter-efficient fine-tuning methods
Proposing Partial-LoRA to reduce trainable parameters while maintaining accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies sparse subnetworks in LoRA adapters
Proposes Partial-LoRA for task-relevant subspace alignment
Reduces trainable parameters by up to 87%
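The key empirical finding above, that the per-layer sparsity budget matters more than the exact weights kept, suggests a budget-allocation step before masking. The sketch below is a hypothetical illustration of that idea: the per-layer importance scores and the proportional allocation rule are assumptions, not the paper's actual criterion.

```python
import numpy as np

def allocate_keep_ratios(importance, global_keep):
    """Distribute a global keep fraction across layers in proportion to
    per-layer importance scores, clipping each ratio to [0, 1].

    Illustrative only: the real Partial-LoRA layer-wise rule may differ.
    """
    importance = np.asarray(importance, dtype=float)
    raw = importance / importance.sum() * global_keep * len(importance)
    return np.clip(raw, 0.0, 1.0)

# Hypothetical task-relevance scores for 4 layers; a global keep
# fraction of 0.13 corresponds to pruning ~87% of adapter parameters.
scores = [0.1, 0.5, 0.9, 0.5]
ratios = allocate_keep_ratios(scores, global_keep=0.13)
```

More important layers receive a larger share of the trainable-parameter budget, while the average keep fraction across layers stays at the global target (here 13%).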
Hamed Damirchi
Australian Institute for Machine Learning, Adelaide University
Cristian Rodriguez-Opazo
Oracle
Computer Vision · Natural Language Processing · Machine Learning
Ehsan Abbasnejad
Assoc. Prof., Monash University
Machine learning · Responsible machine learning · Vision and Language · Machine Reasoning · Bayesian
Zhen Zhang
Australian Institute for Machine Learning, Adelaide University
Javen Shi
Australian Institute for Machine Learning, Adelaide University