AdaSVD: Adaptive Singular Value Decomposition for Large Language Models

📅 2025-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high memory overhead and accuracy degradation of large language models (LLMs) deployed on resource-constrained devices, this paper proposes AdaSVD, an adaptive singular value decomposition (SVD) compression framework. Moving beyond conventional uniform truncation strategies, AdaSVD introduces adaComp, a dynamic error-compensation mechanism, and adaCR, a layer-wise importance-aware compression-ratio allocation scheme. Layer importance is quantified jointly via gradient sensitivity and singular value energy distribution, while alternating optimization of the U and Vᵀ matrices mitigates accumulated truncation errors. Extensive evaluation across diverse LLMs, including Llama and Qwen, demonstrates that AdaSVD achieves, on average, a 42% reduction in memory footprint and a 68% decrease in accuracy loss compared to state-of-the-art SVD-based methods, while maintaining over 98% of original task performance.
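The memory saving from SVD truncation comes from replacing an m×n weight matrix with two thin rank-r factors. A minimal NumPy sketch of this standard technique (illustrative only; not the paper's code, and the matrix sizes are arbitrary):

```python
import numpy as np

def svd_truncate(W, rank):
    """Compress a weight matrix into two rank-r factors via truncated SVD.

    Storage drops from m*n values to rank*(m + n), a saving whenever
    rank < m*n / (m + n).
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # fold singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
A, B = svd_truncate(W, rank=64)
print((A.size + B.size) / W.size)  # 0.25: a 4x memory reduction at rank 64
```

The accuracy gap that AdaSVD targets comes from the discarded singular values: the smaller the rank, the larger the reconstruction error that must be compensated.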

📝 Abstract
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP) tasks, yet their substantial memory requirements present significant challenges for deployment on resource-constrained devices. Singular Value Decomposition (SVD) has emerged as a promising compression technique for LLMs, offering considerable reductions in memory overhead. However, existing SVD-based methods often struggle to effectively mitigate the errors introduced by SVD truncation, leading to a noticeable performance gap when compared to the original models. Furthermore, applying a uniform compression ratio across all transformer layers fails to account for the varying importance of different layers. To address these challenges, we propose AdaSVD, an adaptive SVD-based LLM compression approach. Specifically, AdaSVD introduces adaComp, which adaptively compensates for SVD truncation errors by alternately updating the singular matrices U and V^T. Additionally, AdaSVD introduces adaCR, which adaptively assigns layer-specific compression ratios based on the relative importance of each layer. Extensive experiments across multiple LLM families and evaluation metrics demonstrate that AdaSVD consistently outperforms state-of-the-art (SOTA) SVD-based methods, achieving superior performance with significantly reduced memory requirements. The code and models will be available at https://github.com/ZHITENGLI/AdaSVD.
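The abstract's idea of alternately updating U and Vᵀ to compensate truncation error can be illustrated with activation-aware alternating least squares: starting from the truncated SVD, each factor is re-solved in turn to shrink the reconstruction error on calibration inputs. This is a hypothetical sketch of the alternating principle under an assumed objective ||(W − AB)X||_F, not the paper's actual adaComp update rules; the calibration matrix X is also an assumption:

```python
import numpy as np

def alternating_refine(W, X, rank, n_iters=5):
    """Alternately re-fit two low-rank factors to reduce the
    activation-aware error ||(W - A @ B) @ X||_F.

    Each half-step is an exact least-squares solve with the other
    factor held fixed, so the error is non-increasing across iterations.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # truncated-SVD initialization
    B = Vt[:rank, :]
    C = X @ X.T                 # second-moment matrix of calibration inputs
    for _ in range(n_iters):
        # Fix B, minimize over A:  A = W C B^T (B C B^T)^-1
        A = W @ C @ B.T @ np.linalg.pinv(B @ C @ B.T)
        # Fix A, minimize over B:  B = (A^T A)^-1 A^T W
        B = np.linalg.pinv(A.T @ A) @ A.T @ W
    return A, B

rng = np.random.default_rng(1)
W = rng.standard_normal((256, 256))
X = rng.standard_normal((256, 1024))  # hypothetical calibration activations
A, B = alternating_refine(W, X, rank=32)

# The refined factors never do worse than plain truncation on this objective.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
e0 = np.linalg.norm((W - (U[:, :32] * S[:32]) @ Vt[:32, :]) @ X)
e1 = np.linalg.norm((W - A @ B) @ X)
print(e1 <= e0 + 1e-9)  # True
```

Because plain truncated SVD minimizes only the weight-space error ||W − AB||_F, data-aware refitting of this kind is what lets an alternating scheme recover accuracy lost to truncation.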
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Memory Efficiency
Differential Compression
Innovation

Methods, ideas, or system contributions that make the work stand out.

AdaSVD
Memory-efficient
Model Compression