🤖 AI Summary
To address the need for efficient, fine-tuning-free compression in large-model deployment, this paper proposes a smoothness-driven structured compression method. During training, it jointly imposes nuclear-norm regularization and penalties on the first- and second-order weight derivatives, explicitly inducing structural smoothness in network parameters—thereby enabling natural compatibility with low-rank approximation and structured pruning. This work is the first to systematically integrate smoothness regularization into the model compression training pipeline, eliminating reliance on post-compression fine-tuning—a hallmark of conventional approaches. Evaluated on CIFAR-10, the smoothed ResNet-18 achieves 91% accuracy with 70% parameter reduction, substantially outperforming existing fine-tuning-free methods and attaining state-of-the-art performance.
📝 Abstract
Compressing and pruning large machine learning models has become a critical step towards their deployment in real-world applications. Standard pruning and compression techniques are typically designed without taking the structure of the network's weights into account, limiting their effectiveness. We explore the impact of smooth regularization on neural network training and model compression. By applying nuclear norm, first- and second-order derivative penalties of the weights during training, we encourage structured smoothness while preserving predictive performance on par with non-smooth models. We find that standard pruning methods often perform better when applied to these smooth models. Building on this observation, we apply a Singular-Value-Decomposition-based compression method that exploits the underlying smooth structure and approximates the model's weight tensors by smaller low-rank tensors. Our approach enables state-of-the-art compression without any fine-tuning - reaching up to $91%$ accuracy on a smooth ResNet-18 on CIFAR-10 with $70%$ fewer parameters.