Structured vs. Unstructured Pruning: An Exponential Gap

📅 2026-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the expressive-efficiency gap between structured pruning (i.e., neuron pruning) and unstructured weight pruning under the Strong Lottery Ticket Hypothesis (SLTH), focusing on the over-parameterization needed to approximate a single bias-free ReLU neuron by pruning a randomly initialized two-layer ReLU network without any training. Leveraging geometric properties of ReLU networks and tools from approximation theory, the authors establish the first rigorous lower bound showing that neuron pruning requires Ω(d/ε) hidden neurons to achieve ε-approximation, whereas unstructured pruning suffices with only O(d log(1/ε)) weights. This gap, exponential in the dependence on 1/ε, reveals a fundamental limitation of structured pruning's representational efficiency and fills a theoretical void in the SLTH literature regarding structured sparsity.

📝 Abstract
The Strong Lottery Ticket Hypothesis (SLTH) posits that large, randomly initialized neural networks contain sparse subnetworks capable of approximating a target function at initialization without training, suggesting that pruning alone is sufficient. Pruning methods are typically classified as unstructured, where individual weights can be removed from the network, and structured, where parameters are removed according to specific patterns, as in neuron pruning. Existing theoretical results supporting the SLTH rely almost exclusively on unstructured pruning, showing that logarithmic overparameterization suffices to approximate simple target networks. In contrast, neuron pruning has received limited theoretical attention. In this work, we consider the problem of approximating a single bias-free ReLU neuron using a randomly initialized bias-free two-layer ReLU network, thereby isolating the intrinsic limitations of neuron pruning. We show that neuron pruning requires a starting network with $\Omega(d/\varepsilon)$ hidden neurons to $\varepsilon$-approximate a target ReLU neuron. In contrast, weight pruning achieves $\varepsilon$-approximation with only $O(d\log(1/\varepsilon))$ neurons, establishing an exponential separation between the two pruning paradigms.
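The setting in the abstract can be made concrete with a toy numerical sketch (not from the paper; the network sizes, the evaluation points, and the greedy mask-selection heuristic below are all illustrative assumptions). A structured (neuron-pruning) subnetwork is obtained by masking whole hidden units of a random, untrained two-layer bias-free ReLU network; note that unstructured pruning would instead search the far larger space of per-weight masks (2^{m·d} masks versus 2^m neuron masks), which is the extra freedom behind the O(d log(1/ε)) upper bound.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

d, m = 5, 100  # input dimension and hidden width (hypothetical sizes)

# Target: a single bias-free ReLU neuron f(x) = relu(<w, x>).
w = rng.standard_normal(d)
w /= np.linalg.norm(w)

# Randomly initialized bias-free two-layer ReLU network (never trained):
# g(x) = sum_j a[j] * relu(<U[j], x>).
U = rng.standard_normal((m, d))
a = rng.standard_normal(m)

X = rng.standard_normal((2000, d))  # points for measuring approximation error
y = relu(X @ w)                     # target values
H = relu(X @ U.T)                   # hidden activations, shape (2000, m)

def err(neuron_mask):
    """Mean squared error of the neuron-pruned (structured) subnetwork."""
    return float(np.mean((H @ (a * neuron_mask) - y) ** 2))

# Greedy structured pruning (illustrative heuristic, not the paper's proof
# technique): un-mask one whole neuron at a time as long as it lowers the error.
mask = np.zeros(m)
best = err(mask)
improved = True
while improved:
    improved = False
    for j in np.flatnonzero(mask == 0):
        trial = mask.copy()
        trial[j] = 1.0
        e = err(trial)
        if e < best:
            best, mask, improved = e, trial, True

baseline = err(np.zeros(m))
print(f"kept {int(mask.sum())}/{m} neurons, MSE {best:.4f} (empty-mask MSE {baseline:.4f})")
```

Because only binary masks are allowed (no retraining or rescaling), the approximation power comes entirely from which random neurons happen to point near the target direction, which is the intuition behind the Ω(d/ε) width requirement for structured pruning.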
Problem

Research questions and friction points this paper is trying to address.

structured pruning
unstructured pruning
Strong Lottery Ticket Hypothesis
neuron pruning
ReLU network
Innovation

Methods, ideas, or system contributions that make the work stand out.

structured pruning
unstructured pruning
Strong Lottery Ticket Hypothesis
neuron pruning
exponential separation