🤖 AI Summary
This work addresses the inefficiency in existing Low-Rank Adaptation (LoRA) methods arising from the lack of coordinated optimization among rank allocation, scaling factors, and initialization. The authors propose TLoRA, a novel framework that, during early training, initializes the A matrix into a task-relevant subspace via singular value decomposition of the product between input activation covariances and pre-trained weights, then freezes A and trains only the B matrix. Simultaneously, TLoRA adaptively assigns per-layer ranks and scaling factors based on parameter sensitivity. This approach uniquely unifies initialization and resource allocation in LoRA under a task-aware paradigm, enabling highly efficient fine-tuning within a fixed parameter budget. Experiments demonstrate that TLoRA consistently achieves superior performance across diverse tasks—including natural language understanding, commonsense and mathematical reasoning, code generation, and dialogue—while substantially reducing the number of trainable parameters.
📝 Abstract
Low-Rank Adaptation (LoRA) has become a widely adopted parameter-efficient fine-tuning method for large language models, with its effectiveness largely influenced by the allocation of ranks and scaling factors, as well as initialization. Existing LoRA variants typically address only one of these factors, often at the cost of increased training complexity or reduced practical efficiency. In this work, we present Task-aware Low-Rank Adaptation (TLoRA), a unified framework that jointly optimizes initialization and resource allocation at the outset of training. TLoRA introduces a data-driven initialization strategy that aligns the LoRA $A$ matrix with task-relevant subspaces by performing singular value decomposition on the product of pre-trained weights and input activation covariance. After this, the $A$ matrix is frozen, and only the $B$ matrix is trained. Furthermore, TLoRA employs a sensitivity-based importance metric to adaptively allocate ranks and scaling factors across layers under a fixed parameter budget. We conduct extensive experiments that demonstrate TLoRA consistently performs excellently across various tasks, including natural language understanding, commonsense reasoning, math reasoning, code generation, and chat generation, while significantly reducing the number of trainable parameters.