Growing Winning Subnetworks, Not Pruning Them: A Paradigm for Density Discovery in Sparse Neural Networks

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing sparse training methods (e.g., iterative pruning) either rely on pre-specified target sparsity levels or incur high retraining costs. This paper proposes a "sparse-to-dense growth" paradigm that abandons pruning entirely: edges are grown incrementally, guided by path-weight products; structural bottlenecks are mitigated via stochasticity; and convergence is determined autonomously once accuracy plateaus, enabling end-to-end discovery of the operating sparsity. The core innovation is a growth-based sparsity exploration mechanism that removes the need for a predefined density and enables dynamic trade-offs between accuracy and computational cost. Evaluated on CIFAR-10/100, TinyImageNet, and ImageNet, the method approaches the performance of Iterative Magnitude Pruning (IMP) lottery tickets, though at somewhat higher density, while requiring only ~1.5x the cost of dense training, roughly one-third to one-half the cost of IMP.
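The growth step described above could be sketched roughly as follows. This is a hypothetical simplification for a 3-layer MLP, not the paper's exact algorithm: the score function (product of absolute weight mass entering and leaving a candidate edge, as a proxy for path-weight products), the `eps` uniform-mixing parameter, and all names are illustrative assumptions.

```python
import numpy as np

def pwmp_grow(W1, W3, mask, n_grow, eps=0.25, rng=None):
    """Select n_grow currently-inactive edges for the middle weight
    matrix of a 3-layer MLP (in -> h1 -> h2 -> out).

    Each candidate edge (i, j) is scored by the product of the absolute
    weight mass entering hidden unit i and leaving hidden unit j, a
    crude proxy for the path-weight products through that edge.  With
    probability eps an edge is drawn from a uniform distribution over
    inactive edges instead, mimicking the randomized bottleneck
    mitigation.  Illustrative sketch, not the paper's algorithm.
    """
    if rng is None:
        rng = np.random.default_rng()
    in_mag = np.abs(W1).sum(axis=0)    # |weight| mass into each h1 unit
    out_mag = np.abs(W3).sum(axis=1)   # |weight| mass out of each h2 unit
    score = np.outer(in_mag, out_mag)  # score[i, j] for candidate edge i -> j
    score[mask] = 0.0                  # existing edges are not candidates
    flat = score.ravel()
    # Mix the normalized scores with a uniform distribution over inactive
    # edges so zero-score regions can still receive new connections.
    uniform = (~mask).ravel().astype(float)
    probs = (1 - eps) * flat / flat.sum() + eps * uniform / uniform.sum()
    chosen = rng.choice(flat.size, size=n_grow, replace=False, p=probs)
    new_mask = mask.copy()
    new_mask.ravel()[chosen] = True
    return new_mask
```

Masked (already-active) positions receive exactly zero probability in both mixture components, so sampling without replacement can only activate new edges.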

📝 Abstract
The lottery ticket hypothesis suggests that dense networks contain sparse subnetworks that can be trained in isolation to match full-model performance. Existing approaches (iterative pruning, dynamic sparse training, and pruning at initialization) either incur heavy retraining costs or assume the target density is fixed in advance. We introduce Path Weight Magnitude Product-biased Random growth (PWMPR), a constructive sparse-to-dense training paradigm that grows networks rather than pruning them, while automatically discovering their operating density. Starting from a sparse seed, PWMPR adds edges guided by path-kernel-inspired scores, mitigates bottlenecks via randomization, and stops when a logistic-fit rule detects plateauing accuracy. Experiments on CIFAR, TinyImageNet, and ImageNet show that PWMPR approaches the performance of IMP-derived lottery tickets, though at higher density, at substantially lower cost (~1.5x dense vs. 3-4x for IMP). These results establish growth-based density discovery as a promising paradigm that complements pruning and dynamic sparsity.
Problem

Research questions and friction points this paper is trying to address.

Discovering optimal density in sparse neural networks without pruning
Reducing retraining costs compared to iterative pruning methods
Automatically determining network density during training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Grows sparse networks via path-kernel-guided random growth
Automatically discovers optimal density using logistic-fit rule
Achieves high performance with lower training cost than pruning
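The logistic-fit stopping rule named above might look something like the sketch below: fit a logistic curve to accuracy across growth rounds and stop once the fitted curve predicts a negligible gain from one more round. The linearized fit in logit space, the `ceiling` parameter, and the `tol` threshold are all illustrative assumptions, not the paper's stated procedure.

```python
import numpy as np

def plateau_reached(rounds, accs, ceiling=1.0, tol=0.002):
    """Decide whether accuracy has plateaued across growth rounds.

    Fits a logistic curve acc = ceiling / (1 + exp(-(k*x + b))) by
    linear regression on the logit of acc/ceiling, then reports a
    plateau when the fit predicts less than `tol` accuracy gain over
    the next round.  Hypothetical stand-in for the paper's rule.
    """
    x = np.asarray(rounds, dtype=float)
    y = np.clip(np.asarray(accs, dtype=float) / ceiling, 1e-6, 1 - 1e-6)
    logit = np.log(y / (1 - y))
    k, b = np.polyfit(x, logit, 1)  # linear fit in logit space

    def pred(t):
        return ceiling / (1 + np.exp(-(k * t + b)))

    next_gain = pred(x[-1] + 1) - pred(x[-1])
    return next_gain < tol
```

A curve still climbing steeply yields a large predicted next-round gain (no stop), while accuracies saturating near the ceiling yield a near-zero predicted gain and trigger the stop.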