One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
The relative effectiveness of one-shot pruning versus iterative pruning across varying sparsity levels remains inadequately characterized, particularly under structured and unstructured pruning regimes. Method: a systematic benchmarking study across multiple pruning criteria, modalities, and sparsity ratios, providing rigorous definitions of both paradigms and one of the first comprehensive empirical comparisons between them. The authors further propose a patience-based adaptive pruning criterion and a hybrid strategy that combines the strengths of both approaches. Contribution/Results: one-shot pruning achieves superior accuracy at low sparsity levels, whereas iterative pruning is more robust at high sparsity. The hybrid method can attain better accuracy–compression trade-offs across diverse tasks in certain scenarios. This work offers practical guidelines for selecting a pruning strategy under real-world deployment constraints.

📝 Abstract
Pruning is a core technique for compressing neural networks to improve computational efficiency. This process is typically approached in two ways: one-shot pruning, which involves a single pass of training and pruning, and iterative pruning, where pruning is performed over multiple cycles for potentially finer network refinement. Although iterative pruning has historically seen broader adoption, this preference is often assumed rather than rigorously tested. Our study presents one of the first systematic and comprehensive comparisons of these methods, providing rigorous definitions, benchmarking both across structured and unstructured settings, and applying different pruning criteria and modalities. We find that each method has specific advantages: one-shot pruning proves more effective at lower pruning ratios, while iterative pruning performs better at higher ratios. Building on these findings, we advocate for patience-based pruning and introduce a hybrid approach that can outperform traditional methods in certain scenarios, providing valuable insights for practitioners selecting a pruning strategy tailored to their goals and constraints. Source code is available at https://github.com/janumiko/pruning-benchmark.
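The two paradigms the abstract contrasts can be sketched concretely. Below is a minimal NumPy sketch, not the paper's implementation: it assumes an unstructured magnitude criterion and, for the iterative variant, a linear sparsity schedule with an optional fine-tuning callback between cycles; the function names and schedule are illustrative choices, not from the source.

```python
import numpy as np

def magnitude_mask(w, sparsity):
    """Keep weights above the magnitude threshold that prunes `sparsity` of them."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return np.ones_like(w, dtype=bool)
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.abs(w) > threshold

def one_shot_prune(w, target_sparsity):
    """One-shot: a single pass straight to the target sparsity."""
    return w * magnitude_mask(w, target_sparsity)

def iterative_prune(w, target_sparsity, cycles=5, finetune=None):
    """Iterative: prune over several cycles, optionally fine-tuning between them."""
    for step in range(1, cycles + 1):
        sparsity = target_sparsity * step / cycles  # linear schedule (assumption)
        w = w * magnitude_mask(w, sparsity)
        if finetune is not None:
            w = finetune(w)  # e.g. a few epochs of retraining on the masked model
    return w
```

Both routines land at the same final sparsity; the difference the paper benchmarks is whether the network gets recovery (fine-tuning) opportunities between smaller pruning steps.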
Problem

Research questions and friction points this paper is trying to address.

Does one-shot or iterative pruning yield better accuracy, and at which compression ratios?
How does pruning effectiveness differ between structured and unstructured compression settings?
Which pruning strategy should practitioners choose for a given target compression ratio?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic, rigorously defined comparison of one-shot and iterative pruning methods
Hybrid strategy that can outperform both traditional approaches in certain scenarios
Patience-based pruning criterion that adapts to practitioners' goals and constraints
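The paper's exact patience criterion is in its released source; as an illustrative assumption, "patience-based" pruning here is sketched in the spirit of early stopping: keep pruning only while the validation metric recovers within a fixed window of evaluations. The function name, window semantics, and tolerance are hypothetical.

```python
def should_stop(history, patience=3, tolerance=1e-3):
    """Stop pruning when the metric has not improved by at least
    `tolerance` over its previous best within the last `patience`
    evaluations (illustrative early-stopping-style rule)."""
    if len(history) <= patience:
        return False  # not enough evaluations yet to judge
    best_before = max(history[:-patience])
    recent_best = max(history[-patience:])
    return recent_best < best_before + tolerance
```

A driver loop would call `should_stop` after each prune-and-retrain cycle, so the schedule adapts to how well the network recovers rather than running a fixed number of cycles.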