Growing Winning Subnetworks, Not Pruning Them: A Paradigm for Density Discovery in Sparse Neural Networks

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing sparse training methods (e.g., iterative pruning) either rely on pre-specified target sparsity levels or incur high retraining costs. This paper proposes a "sparse-to-dense growth" paradigm that abandons pruning entirely: edges are grown incrementally, guided by path-weight products; structural bottlenecks are mitigated via stochasticity; and convergence is determined autonomously once accuracy plateaus, enabling end-to-end discovery of the operating sparsity. The core innovation is a growth-based sparsity exploration mechanism that removes the need for a predefined density and enables dynamic trade-offs between accuracy and computational cost. Evaluated on CIFAR-10/100, TinyImageNet, and ImageNet, the method approaches the performance of Iterative Magnitude Pruning (IMP) lottery tickets, though at somewhat higher density, while requiring only ~1.5x the cost of dense training, roughly one-third to one-half the cost of IMP.
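The growth step described above could be sketched roughly as follows. This is a hypothetical simplification for a 3-layer MLP, not the paper's exact algorithm: the score function (product of absolute weight mass entering and leaving a candidate edge, as a proxy for path-weight products), the `eps` uniform-mixing parameter, and all names are illustrative assumptions.

```python
import numpy as np

def pwmp_grow(W1, W3, mask, n_grow, eps=0.25, rng=None):
    """Select n_grow currently-inactive edges for the middle weight
    matrix of a 3-layer MLP (in -> h1 -> h2 -> out).

    Each candidate edge (i, j) is scored by the product of the absolute
    weight mass entering hidden unit i and leaving hidden unit j, a
    crude proxy for the path-weight products through that edge.  With
    probability eps an edge is drawn from a uniform distribution over
    inactive edges instead, mimicking the randomized bottleneck
    mitigation.  Illustrative sketch, not the paper's algorithm.
    """
    if rng is None:
        rng = np.random.default_rng()
    in_mag = np.abs(W1).sum(axis=0)    # |weight| mass into each h1 unit
    out_mag = np.abs(W3).sum(axis=1)   # |weight| mass out of each h2 unit
    score = np.outer(in_mag, out_mag)  # score[i, j] for candidate edge i -> j
    score[mask] = 0.0                  # existing edges are not candidates
    flat = score.ravel()
    # Mix the normalized scores with a uniform distribution over inactive
    # edges so zero-score regions can still receive new connections.
    uniform = (~mask).ravel().astype(float)
    probs = (1 - eps) * flat / flat.sum() + eps * uniform / uniform.sum()
    chosen = rng.choice(flat.size, size=n_grow, replace=False, p=probs)
    new_mask = mask.copy()
    new_mask.ravel()[chosen] = True
    return new_mask
```

Masked (already-active) positions receive exactly zero probability in both mixture components, so sampling without replacement can only activate new edges.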

📝 Abstract
The lottery ticket hypothesis suggests that dense networks contain sparse subnetworks that can be trained in isolation to match full-model performance. Existing approaches (iterative pruning, dynamic sparse training, and pruning at initialization) either incur heavy retraining costs or assume the target density is fixed in advance. We introduce Path Weight Magnitude Product-biased Random growth (PWMPR), a constructive sparse-to-dense training paradigm that grows networks rather than pruning them, while automatically discovering their operating density. Starting from a sparse seed, PWMPR adds edges guided by path-kernel-inspired scores, mitigates bottlenecks via randomization, and stops when a logistic-fit rule detects plateauing accuracy. Experiments on CIFAR, TinyImageNet, and ImageNet show that PWMPR approaches the performance of IMP-derived lottery tickets, though at higher density, at substantially lower cost (~1.5x dense vs. 3-4x for IMP). These results establish growth-based density discovery as a promising paradigm that complements pruning and dynamic sparsity.
Problem

Research questions and friction points this paper is trying to address.

Discovering optimal density in sparse neural networks without pruning
Reducing retraining costs compared to iterative pruning methods
Automatically determining network density during training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Grows sparse networks via path-kernel-guided random growth
Automatically discovers optimal density using logistic-fit rule
Achieves high performance with lower training cost than pruning
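The logistic-fit stopping rule named above might look something like the sketch below: fit a logistic curve to accuracy across growth rounds and stop once the fitted curve predicts a negligible gain from one more round. The linearized fit in logit space, the `ceiling` parameter, and the `tol` threshold are all illustrative assumptions, not the paper's stated procedure.

```python
import numpy as np

def plateau_reached(rounds, accs, ceiling=1.0, tol=0.002):
    """Decide whether accuracy has plateaued across growth rounds.

    Fits a logistic curve acc = ceiling / (1 + exp(-(k*x + b))) by
    linear regression on the logit of acc/ceiling, then reports a
    plateau when the fit predicts less than `tol` accuracy gain over
    the next round.  Hypothetical stand-in for the paper's rule.
    """
    x = np.asarray(rounds, dtype=float)
    y = np.clip(np.asarray(accs, dtype=float) / ceiling, 1e-6, 1 - 1e-6)
    logit = np.log(y / (1 - y))
    k, b = np.polyfit(x, logit, 1)  # linear fit in logit space

    def pred(t):
        return ceiling / (1 + np.exp(-(k * t + b)))

    next_gain = pred(x[-1] + 1) - pred(x[-1])
    return next_gain < tol
```

A curve still climbing steeply yields a large predicted next-round gain (no stop), while accuracies saturating near the ceiling yield a near-zero predicted gain and trigger the stop.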