Smooth Model Compression without Fine-Tuning

📅 2025-05-30

📈 Citations: 0

✨ Influential: 0

career value

194K/year

🤖 AI Summary

To address the need for efficient, fine-tuning-free compression in large-model deployment, this paper proposes a smoothness-driven structured compression method. During training, it jointly imposes nuclear-norm regularization and penalties on the first- and second-order weight derivatives, explicitly inducing structural smoothness in network parameters—thereby enabling natural compatibility with low-rank approximation and structured pruning. This work is the first to systematically integrate smoothness regularization into the model compression training pipeline, eliminating reliance on post-compression fine-tuning—a hallmark of conventional approaches. Evaluated on CIFAR-10, the smoothed ResNet-18 achieves 91% accuracy with 70% parameter reduction, substantially outperforming existing fine-tuning-free methods and attaining state-of-the-art performance.

Technology Category

Application Category

📝 Abstract

Compressing and pruning large machine learning models has become a critical step towards their deployment in real-world applications. Standard pruning and compression techniques are typically designed without taking the structure of the network's weights into account, limiting their effectiveness. We explore the impact of smooth regularization on neural network training and model compression. By applying nuclear norm, first- and second-order derivative penalties of the weights during training, we encourage structured smoothness while preserving predictive performance on par with non-smooth models. We find that standard pruning methods often perform better when applied to these smooth models. Building on this observation, we apply a Singular-Value-Decomposition-based compression method that exploits the underlying smooth structure and approximates the model's weight tensors by smaller low-rank tensors. Our approach enables state-of-the-art compression without any fine-tuning - reaching up to $91%$ accuracy on a smooth ResNet-18 on CIFAR-10 with $70%$ fewer parameters.

Problem

Research questions and friction points this paper is trying to address.

Compressing large ML models without fine-tuning

Improving pruning via smooth weight regularization

Enhancing compression via low-rank tensor approximation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses smooth regularization for model compression

Applies nuclear norm and derivative penalties

Employs SVD-based low-rank tensor approximation

🔎 Similar Papers

MCNC: Manifold-Constrained Reparameterization for Neural Compression