Learning effective pruning at initialization from iterative pruning

📅 2024-08-27
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Pruning-at-Initialization (PaI) methods suffer significant performance degradation at high sparsity compared to iterative pruning. Method: This paper proposes AutoSparse, the first end-to-end learnable PaI framework, which models the mapping between initial parameter features and their survival probabilities under Iterative Rewind Pruning (IRP), learning a generalizable pruning score function with a neural network. Leveraging the IRP-revealed correlation between initialization states and subnet importance, AutoSparse enables cross-architecture (e.g., ResNet-18 → VGG-16) and cross-dataset (e.g., CIFAR-10 → TinyImageNet) transfer. Contribution/Results: Experiments demonstrate that AutoSparse substantially outperforms existing PaI methods at high sparsity and generalizes from a single IRP training run, significantly reducing computational overhead while maintaining accuracy.
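
The core mechanism described above, a small network that scores each initial weight and prunes the lowest-scoring ones before training, can be pictured with a minimal PyTorch sketch. This assumes a toy two-dimensional feature per weight (its initial value and normalized layer depth); `ScoreNet`, `weight_features`, and `prune_at_init` are illustrative names and a simplified feature set, not the paper's actual design:

```python
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Tiny MLP mapping per-weight features to a survival score (illustrative)."""
    def __init__(self, in_features=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def weight_features(model):
    """Assumed feature set: each weight's initial value plus its normalized layer depth."""
    weights = [p for p in model.parameters() if p.dim() > 1]
    feats = []
    for depth, p in enumerate(weights):
        d = torch.full((p.numel(),), depth / max(len(weights) - 1, 1))
        feats.append(torch.stack([p.detach().flatten(), d], dim=1))
    return torch.cat(feats)

def prune_at_init(model, scorer, sparsity=0.9):
    """Score all weights at initialization and zero out the lowest-scoring ones."""
    feats = weight_features(model)
    with torch.no_grad():
        scores = scorer(feats)
        k = max(int(sparsity * scores.numel()), 1)
        threshold = scores.kthvalue(k).values  # k-th smallest score is the cutoff
        offset = 0
        for p in model.parameters():
            if p.dim() <= 1:
                continue  # leave biases and norm parameters dense
            s = scores[offset:offset + p.numel()].view_as(p)
            p.mul_(s > threshold)  # keep only weights scoring above the cutoff
            offset += p.numel()
```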

📝 Abstract
Pruning at initialization (PaI) reduces training costs by removing weights before training, which becomes increasingly crucial with growing network sizes. However, current PaI methods still have a large accuracy gap with iterative pruning, especially at high sparsity levels. This raises an intriguing question: can we get inspiration from iterative pruning to improve PaI performance? In the lottery ticket hypothesis, iterative rewind pruning (IRP) finds subnetworks retroactively by rewinding the parameters to the original initialization in every pruning iteration, which means all the subnetworks are based on the initial state. Here, we hypothesize that the surviving subnetworks are more important, and we bridge the initial features and their survival scores as the PaI criterion. We employ an end-to-end neural network (AutoSparse) to learn this correlation: it takes the model's initial features as input, outputs their scores, and the lowest-scoring parameters are then pruned before training. To validate the accuracy and generalization of our method, we performed PaI across various models. Results show that our approach outperforms existing methods in high-sparsity settings. Notably, as the underlying logic of model pruning is consistent across different models, only a one-time IRP on one model is needed (e.g., after one IRP run on ResNet-18/CIFAR-10, AutoS generalizes to VGG-16/CIFAR-10, ResNet-18/TinyImageNet, etc.). As the first neural network-based PaI method, we conduct extensive experiments to validate the factors influencing this approach. These results reveal the learning tendencies of neural networks and provide new insights into our understanding and research of PaI from a practical perspective. Our code is available at: https://github.com/ChengYaofeng/AutoSparse.git.
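
The IRP procedure the abstract leans on (train, prune the smallest-magnitude surviving weights, rewind the survivors to their initial values, repeat) can be sketched as follows. This is a minimal sketch under common lottery-ticket conventions; `train_one_round` and the magnitude criterion are assumed stand-ins for the paper's actual training loop and pruning rule:

```python
import copy
import torch

def iterative_rewind_pruning(model, train_one_round, rounds=5, prune_frac=0.2):
    """Sketch of IRP: train, prune the smallest-magnitude surviving weights,
    rewind survivors to their initial values, and repeat.
    `train_one_round(model)` is an assumed user-supplied training function."""
    init_state = copy.deepcopy(model.state_dict())  # the initialization to rewind to
    masks = {n: torch.ones_like(p, dtype=torch.bool)
             for n, p in model.named_parameters() if p.dim() > 1}

    for _ in range(rounds):
        train_one_round(model)
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                # drop prune_frac of the weights that are still alive
                alive = p.abs()[masks[name]]
                threshold = alive.quantile(prune_frac)
                masks[name] &= p.abs() > threshold
            # rewind: every surviving weight returns to its initial value
            model.load_state_dict(init_state)
            for name, p in model.named_parameters():
                if name in masks:
                    p.mul_(masks[name])
    # the final masks record which initial weights "survive"; these survival
    # outcomes become the training labels for the score network
    return masks
```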
Problem

Research questions and friction points this paper is trying to address.

Improves pruning at initialization by learning from iterative pruning methods
Reduces accuracy gap in high-sparsity neural network pruning before training
Generalizes pruning criteria across models using a neural network-based approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

AutoSparse learns pruning scores from iterative rewind patterns
Neural network predicts importance using initial weight features
One-time training generalizes across models and datasets (a hypothetical fit is sketched below)
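
That one-time training step could be an ordinary supervised fit of the score network on labels from a single IRP run. A hypothetical sketch, where `feats` and `labels` stand for the per-weight initial features and IRP survival outcomes (e.g., collected once from ResNet-18/CIFAR-10 with the two sketches above):

```python
import torch
import torch.nn as nn

def fit_scorer(scorer, feats, labels, epochs=100, lr=1e-3):
    """Fit the score network once on (initial feature, survived?) pairs from a
    single IRP run; the fitted scorer is then reused unchanged elsewhere."""
    opt = torch.optim.Adam(scorer.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # treat survival as binary classification
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(scorer(feats), labels.float())
        loss.backward()
        opt.step()
    return scorer
```

After this single fit, the same scorer would prune a different architecture at initialization (e.g., VGG-16) by scoring its initial weights, which is the cross-model transfer setting the paper claims.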