🤖 AI Summary
To address the high computational overhead, environmental cost, and poor accessibility of large language models (LLMs) in low-resource settings, this paper proposes a "recycle-and-reuse" paradigm for LoRA adapters. It maps heterogeneous pretrained LoRA modules into a unified principal subspace, establishing a shareable foundation of domain knowledge; within this subspace, new tasks are adapted by learning only lightweight coefficients on the principal components, which can be augmented with additional orthogonal basis vectors in low-resource scenarios, so adaptation adds essentially no new parameters. The method integrates low-rank decomposition, PCA-based subspace alignment, and orthogonal basis construction into a parameter-efficient fine-tuning (PEFT) framework that, the authors argue, is the first to support cross-task knowledge transfer without parameter growth. Experiments demonstrate that the approach reduces training and inference parameter counts and memory footprint by 72% on average, while maintaining state-of-the-art performance across diverse domains, enabling efficient edge deployment and democratized LLM adaptation.
📝 Abstract
The rapid growth of large models has raised concerns about their environmental impact and equity in accessibility due to significant computational costs. Low-Rank Adapters (LoRA) offer a lightweight solution for finetuning large models, resulting in an abundance of publicly available adapters tailored to diverse domains. We ask: Can these pretrained adapters be leveraged to further streamline adaptation to new tasks while addressing these challenges? We introduce EigenLoRAx, a parameter-efficient finetuning method that recycles existing adapters to create a principal subspace aligned with their shared domain knowledge, which can be further augmented with orthogonal basis vectors in low-resource scenarios. This enables rapid adaptation to new tasks by learning only lightweight coefficients on the principal components of the subspace, eliminating the need to finetune entire adapters. EigenLoRAx requires significantly fewer parameters and memory, improving efficiency for both training and inference. Our method demonstrates strong performance across diverse domains and tasks, offering a scalable solution for edge-based applications, personalization, and equitable deployment of large models in resource-constrained environments.
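The core idea, recovering a shared principal subspace from a pool of pretrained LoRA adapters and then adapting to a new task by learning only per-component coefficients, can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the paper's exact algorithm: the layer sizes, the number of components `k`, and the plain PCA-via-SVD step are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_adapters, k = 64, 4, 10, 8  # layer dim, LoRA rank, #adapters, components kept

# Each pretrained LoRA adapter contributes a low-rank update W_i = B_i @ A_i (d x d).
adapters = [rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
            for _ in range(n_adapters)]

# Flatten the updates and run PCA (via SVD) to extract the principal subspace
# shared across the adapter pool.
X = np.stack([W.ravel() for W in adapters])           # (n_adapters, d*d)
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:k]                                    # (k, d*d) orthonormal basis

# Adapting to a new task now means learning only k coefficients per layer
# instead of a full d x r + r x d adapter; here we simply project a
# held-out adapter onto the subspace to show the parameter count.
W_new = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
coeffs = components @ (W_new.ravel() - mean)           # (k,) trainable parameters
W_recon = (coeffs @ components + mean).reshape(d, d)   # reconstructed update

print(coeffs.size, 2 * d * r)  # k coefficients vs. full LoRA parameter count
```

In practice the coefficients would be trained by gradient descent on the downstream task rather than obtained by projection, and the subspace could be extended with extra orthogonal basis vectors when the pretrained adapters do not span the new task well.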