DipSVD: Dual-importance Protected SVD for Efficient LLM Compression

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing SVD-based compression methods for large language models (LLMs) often neglect the protection of critical singular components, leading to significant performance degradation. To address this, we propose a dual-importance preservation mechanism: (1) globally, dynamically allocating layer-wise compression ratios to balance computational burden across layers; and (2) locally, enhancing retention of salient singular vectors via channel-weighted data whitening. Our approach preserves the hardware compatibility and theoretical interpretability inherent to SVD while substantially improving compression robustness. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art SVD compression baselines across multiple benchmarks. Notably, it maintains strong performance even under extreme compression—e.g., retaining only 20% of the original singular values—thereby establishing a new paradigm for efficient and reliable LLM deployment.
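The local mechanism described above can be sketched in a few lines of numpy. This is a rough illustration under stated assumptions, not the paper's exact formulation: `whitened_truncated_svd` is a hypothetical helper, and per-channel activation energy stands in for whatever channel weights the authors derive from calibration data.

```python
import numpy as np

def whitened_truncated_svd(W, X, ratio=0.2, eps=1e-6):
    """Compress weight W (out x in) into rank-k factors, keeping `ratio`
    of the singular values. Input channels are scaled by their activation
    energy first (a stand-in for channel-weighted data whitening), so the
    truncation preserves directions that matter most for real inputs."""
    # Per-input-channel importance from calibration activations X (n x in).
    s = np.sqrt((X ** 2).mean(axis=0)) + eps   # channel scales
    S = np.diag(s)
    S_inv = np.diag(1.0 / s)
    # SVD of the scaled matrix W @ S emphasizes high-energy channels.
    U, sing, Vt = np.linalg.svd(W @ S, full_matrices=False)
    k = max(1, int(ratio * len(sing)))
    # Low-rank factors; fold the inverse scaling back into the right factor,
    # so A @ B approximates the original W, not the whitened one.
    A = U[:, :k] * sing[:k]          # out x k
    B = Vt[:k] @ S_inv               # k x in
    return A, B

# Usage: a 64x128 weight compressed at the paper's extreme 20% setting.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
X = rng.standard_normal((256, 128))
A, B = whitened_truncated_svd(W, X, ratio=0.2)
print(A.shape, B.shape)  # (64, 12) (12, 128)
```

Replacing the dense matmul `x @ W.T` with two smaller ones, `(x @ B.T) @ A.T`, is what gives SVD compression its hardware friendliness: it stays plain dense linear algebra, unlike unstructured sparsity.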

📝 Abstract
The ever-increasing computational demands and deployment costs of large language models (LLMs) have spurred numerous compression methods. Compared to quantization and unstructured pruning, SVD compression offers superior hardware compatibility and theoretical guarantees. However, existing SVD-based methods focus on the overall discrepancy between the original and compressed matrices while overlooking the protection of critical components within the matrix, which leads to inferior performance in the compressed models. This paper proposes a dual-level importance protection mechanism to enhance SVD-based compression methods: (1) local importance protection: preserving the most critical singular vectors within each weight matrix through channel-weighted data whitening; and (2) global importance protection: enabling less important layers to bear a greater portion of the compression burden through either a heuristic or optimization-based approach, thereby minimizing the impact of compression on critical layers. Extensive experiments demonstrate that DipSVD outperforms existing SVD-based compression approaches across multiple benchmarks, achieving superior model performance especially at high model compression ratios.
Problem

Research questions and friction points this paper is trying to address.

Existing SVD-based compression neglects critical components within weight matrices
Uniform layer-wise compression ratios ignore differences in layer importance
Model performance degrades sharply at high compression ratios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Channel-weighted whitening for local importance protection
Heuristic or optimization-based global importance protection
Dual-level importance protection in SVD compression
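The global protection idea (less important layers absorb more of the compression budget) can be sketched as a simple heuristic allocator. The function name, the clipping bounds, and the assumption that per-layer importance scores are already available (e.g. from a calibration-loss sensitivity probe) are all illustrative, not taken from the paper:

```python
import numpy as np

def allocate_ratios(importance, target_keep=0.2, lo=0.05, hi=0.95):
    """Assign per-layer keep-ratios proportional to layer importance,
    clipped to [lo, hi], then rescaled so the mean keep-ratio meets the
    global budget `target_keep` (approximately, since clipping can bind)."""
    imp = np.asarray(importance, dtype=float)
    w = imp / imp.sum()
    ratios = np.clip(target_keep * len(imp) * w, lo, hi)
    # Rescale after clipping so the average keep-ratio tracks the budget.
    ratios *= target_keep / ratios.mean()
    return np.clip(ratios, lo, hi)

# Usage: four layers, a 20% global budget; the most important layer
# (score 3.0) keeps the most singular values, the least important the fewest.
r = allocate_ratios([1.0, 3.0, 0.5, 2.0], target_keep=0.2)
print(r.round(3))
```

An optimization-based variant, as the bullet above suggests, would instead solve for the ratios that minimize a reconstruction- or loss-based objective subject to the same global budget constraint.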
👥 Authors
Xuan Ding — Beijing Normal University
Rui Sun — The Chinese University of Hong Kong, Shenzhen
Yunjian Zhang — University of Chinese Academy of Sciences
Xiu Yan — Tsinghua University
Yueqi Zhou — Beijing Normal University
Kaihao Huang — Beijing Normal University
Suzhong Fu — The Chinese University of Hong Kong, Shenzhen
Chuanlong Xie — Beijing Normal University
Yao Zhu — Zhejiang University