🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT) methods, such as LoRA, do not fully capture the weight updates of full fine-tuning and still require storing the full set of frozen parameters, which limits memory efficiency. This work proposes a paradigm based on singular value decomposition (SVD), introducing, for the first time in PEFT, a principal-component subspace that preserves cumulative spectral energy. By performing updates exclusively within the low-rank subspace that retains 90% to 95% of the spectral energy, the method overcomes the representational limitations of conventional low-rank adaptation. It outperforms state-of-the-art PEFT approaches across diverse tasks, including image classification, text-to-image generation, and natural language understanding, while also reducing memory overhead.
📝 Abstract
To mitigate the memory constraints associated with fine-tuning large pre-trained models, existing parameter-efficient fine-tuning (PEFT) methods, such as LoRA, rely on low-rank updates. However, such updates fail to fully capture the rank characteristics of the weight modifications observed in full-parameter fine-tuning, resulting in a performance gap. Furthermore, LoRA and other existing PEFT methods still require substantial memory to store the full set of frozen weights, limiting their efficiency in resource-constrained settings. To address these limitations, we introduce Cumulative Energy-Retaining Subspace Adaptation (CERSA), a novel fine-tuning paradigm that leverages singular value decomposition (SVD) to retain only the principal components responsible for 90% to 95% of the spectral energy. By fine-tuning low-rank representations derived from this principal subspace, CERSA significantly reduces memory consumption. We conduct extensive evaluations of CERSA across models of varying scales and domains, including image recognition, text-to-image generation, and natural language understanding. Empirical results demonstrate that CERSA consistently outperforms state-of-the-art PEFT methods while achieving substantially lower memory requirements. The code will be publicly released.
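The energy-retaining truncation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, which has not been released. It assumes "spectral energy" means the sum of squared singular values (the abstract does not define it), and the helper names `energy_rank` and `truncate_by_energy` are hypothetical. The sketch covers only the selection of the principal subspace, not how CERSA parameterizes or trains it.

```python
import numpy as np


def energy_rank(singular_values, energy=0.90):
    """Smallest k whose top-k squared singular values reach `energy` of the total.

    Assumption: spectral energy = sum of squared singular values.
    """
    squared = np.asarray(singular_values, dtype=np.float64) ** 2
    cumulative = np.cumsum(squared) / np.sum(squared)
    # First index where the cumulative fraction is >= the threshold.
    k = int(np.searchsorted(cumulative, energy, side="left")) + 1
    # Guard against rounding when energy is (close to) 1.0.
    return min(k, len(squared))


def truncate_by_energy(weight, energy=0.90):
    """Return the rank-k SVD factors (U_k, s_k, Vt_k) retaining `energy`."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = energy_rank(s, energy)
    return u[:, :k], s[:k], vt[:k, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32))  # stand-in for a pre-trained weight matrix
    u_k, s_k, vt_k = truncate_by_energy(w, energy=0.90)
    k = s_k.shape[0]
    print("retained rank:", k)
    print("dense parameters:", w.size)
    print("factored parameters:", k * (w.shape[0] + w.shape[1] + 1))
```

For an m × n matrix, storing the rank-k factors takes k(m + n + 1) values instead of mn, so the factored form is smaller only when k is well below mn / (m + n). How low k falls at a 90% to 95% threshold depends on how quickly the singular values of the particular weight matrix decay.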