Low-Rank Prehab: Preparing Neural Networks for SVD Compression

📅 2025-12-01
🤖 AI Summary
To address the significant performance degradation that models suffer after SVD-based compression, this paper proposes Low-Rank Prehab, a low-rank fine-tuning stage applied before compression. Its core innovation is a "pre-compression adaptation" mechanism that explicitly optimizes the spectral compactness of weight matrices prior to SVD decomposition, inducing inherent low-rank structure in the model parameters. Unlike architecture-specific approaches, Low-Rank Prehab applies broadly to Transformer-based models, including LLMs and ViTs. It combines task-aware pre-finetuning with lightweight post-compression fine-tuning to jointly improve compression robustness and recovery efficiency. Experiments across multiple compression ratios show that Low-Rank Prehab substantially outperforms baselines such as SVD-LLM, with markedly reduced accuracy loss on benchmarks including ImageNet and GLUE, while also improving model compressibility and accelerating fine-tuning convergence after SVD compression.

📝 Abstract
Low-rank approximation methods such as singular value decomposition (SVD) and its variants (e.g., Fisher-weighted SVD, Activation SVD) have recently emerged as effective tools for neural network compression. In this setting, decomposition acts as a "surgical" intervention, followed by fine-tuning that serves as "rehab" to recover accuracy. Inspired by prehabilitation in surgery, we introduce a pre-compression fine-tuning stage, Low-Rank Prehab, that explicitly encourages low-rank structure in weight matrices while preserving task performance. By conditioning the model before SVD, Prehab steers weights toward spectrally compact regions of the parameter space, enabling smoother low-rank approximation and improved recovery. Experiments on large language models (LLMs) and other Transformer-based architectures, including Vision Transformers (ViTs), show that Prehab substantially reduces the immediate accuracy drop after compression and consistently improves post-finetuning performance. Across a wide range of compression ratios, our method outperforms state-of-the-art SVD-based techniques such as SVD-LLM, highlighting the importance of preparing models for compression rather than only improving the compression and recovery stages. Source code is available at https://github.com/niqretnuh/PREHAB-SVD.
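The core operation the abstract describes, replacing a weight matrix with a truncated SVD, can be sketched in a few lines. This NumPy sketch (shapes and the rank are illustrative, not taken from the paper) also shows why "spectrally compact" weights matter: a matrix whose energy sits in a few singular values compresses with almost no error, while an unstructured one does not.

```python
import numpy as np

def svd_compress(W, rank):
    """Truncate W to the given rank via SVD, returning two factors.

    Storing A (m x r) and B (r x n) in place of W (m x n) saves
    parameters whenever r < m*n / (m + n).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # (m, r) factor, singular values folded in
    B = Vt[:rank, :]             # (r, n) factor
    return A, B

rng = np.random.default_rng(0)
# A matrix with a fast-decaying spectrum (exactly rank 8 here) ...
low = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))
# ... versus an unstructured Gaussian matrix of the same size.
full = rng.standard_normal((64, 64))

for name, W in [("spectrally compact", low), ("unstructured", full)]:
    A, B = svd_compress(W, rank=8)
    rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(f"{name}: relative error at rank 8 = {rel_err:.3f}")
```

Prehab's premise, in these terms, is that pre-finetuning can move a trained model's weights closer to the first case before the truncation happens.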
Problem

Research questions and friction points this paper is trying to address.

Prepares neural networks for SVD compression via pre-fine-tuning
Reduces accuracy drop after compression in Transformers and LLMs
Outperforms existing SVD methods across various compression ratios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pre-compression fine-tuning encourages low-rank weight structure
Conditions models for smoother SVD approximation and improved recovery
Reduces accuracy drop and outperforms existing SVD-based compression methods
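The page does not spell out the regularizer Prehab uses, but one standard way to "encourage low-rank weight structure" during fine-tuning is a nuclear-norm penalty on each weight matrix. The sketch below (a hypothetical illustration, not the authors' method) applies the penalty's proximal step, soft-thresholding the singular values, and checks that it concentrates the matrix's energy in the leading directions:

```python
import numpy as np

def nuclear_prox(W, tau):
    """Proximal step for a nuclear-norm penalty: soft-threshold the
    singular values by tau. Interleaved with task-gradient updates,
    this drives small singular values toward zero, concentrating the
    matrix's energy in a few leading directions."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def energy_in_top_k(W, k):
    """Fraction of squared Frobenius norm carried by the top-k singular values."""
    s = np.linalg.svd(W, compute_uv=False)
    return (s[:k] ** 2).sum() / (s ** 2).sum()

rng = np.random.default_rng(1)
W = rng.standard_normal((32, 32))          # stand-in for a weight matrix
before = energy_in_top_k(W, 4)
for _ in range(20):                        # 20 proximal steps
    W = nuclear_prox(W, tau=0.2)
after = energy_in_top_k(W, 4)
print(f"top-4 energy fraction: {before:.2f} -> {after:.2f}")
```

In a real prehab run the thresholding would alternate with (or be folded into) task-loss updates so that accuracy is preserved while the spectrum compacts; the standalone loop above only isolates the spectral effect.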
Haoran Qin, Shansita Sharma, Ali Abbasi, Chayne Thrash, Soheil Kolouri
Department of Computer Science, Vanderbilt University, Nashville, TN, USA