Theoretical Compression Bounds for Wide Multilayer Perceptrons

📅 2025-12-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Despite the empirical success of pruning and quantization for large neural networks, a rigorous theoretical foundation, particularly one free of data assumptions, has been lacking. Method: The paper introduces a post-training randomized greedy compression algorithm that unifies the analysis of pruning (including structured pruning) and quantization, and uses it to rigorously characterize fundamental compression limits without requiring any training data. Contribution/Results: The authors prove that sufficiently wide multilayer perceptrons (MLPs) contain pruned/quantized subnetworks with significantly reduced parameter counts yet nearly unchanged performance, extend the framework to convolutional neural networks (CNNs), and derive a quantitative trade-off between compressibility and network width. This provides a unified, rigorous, and practically relevant theoretical foundation for wide-network compression, bridging the gap between the empirical success of model compression and its theoretical understanding.
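
A minimal sketch of how such a post-training randomized greedy step could look for a single ReLU layer. Everything here is an illustrative assumption rather than the paper's exact construction: the function names, the Gaussian probe inputs standing in for the data-free setting, and the codebook (a codebook of {0} recovers pruning; a larger codebook gives quantization).

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_output(W, X):
    # Single ReLU layer applied to probe inputs X (columns are samples).
    return np.maximum(W @ X, 0.0)

def randomized_greedy_compress(W, codebook, n_steps, n_candidates=32, n_probes=256):
    # Greedily map entries of W onto `codebook`: at each step, score a random
    # subset of not-yet-compressed entries and commit the one whose quantization
    # perturbs the layer output least on random Gaussian probes (data-free).
    W = W.copy()
    X = rng.standard_normal((W.shape[1], n_probes))  # data-free probe inputs
    base = layer_output(W, X)
    remaining = np.arange(W.size)
    for _ in range(min(n_steps, W.size)):
        pick = rng.choice(remaining.size, size=min(n_candidates, remaining.size), replace=False)
        best_err, best_flat, best_val = np.inf, None, None
        for flat in remaining[pick]:
            i, j = np.unravel_index(flat, W.shape)
            w_old = W[i, j]
            W[i, j] = codebook[np.argmin(np.abs(codebook - w_old))]  # snap to nearest code
            err = np.linalg.norm(layer_output(W, X) - base)
            if err < best_err:
                best_err, best_flat, best_val = err, flat, W[i, j]
            W[i, j] = w_old  # undo the trial edit
        i, j = np.unravel_index(best_flat, W.shape)
        W[i, j] = best_val
        remaining = remaining[remaining != best_flat]
    return W

W = rng.standard_normal((64, 128)) / np.sqrt(128)  # a "wide" layer
W_pruned = randomized_greedy_compress(W, codebook=np.array([0.0]), n_steps=100)
print("weights set to zero:", int(np.sum(W_pruned == 0.0)))
```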

📝 Abstract
Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.
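
For context on the OBD connection above: Optimal Brain Damage ranks each weight by a second-order saliency under a diagonal-Hessian approximation, s_i = (1/2) H_{ii} w_i^2, where H_{ii} is the i-th diagonal entry of the loss Hessian and w_i the corresponding weight; the lowest-saliency weights are deleted first. A "post-training randomized version" can then be read as sampling candidate weights and greedily removing low-impact ones after training, as in the sketch above, rather than interleaving deletion with retraining as classical OBD does.
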
Problem

Research questions and friction points this paper is trying to address.

Theoretical justification lacking for the empirical success of pruning and quantization
Whether pruned/quantized subnetworks with competitive performance provably exist, without data assumptions
How compressibility trades off against network width
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-training randomized greedy algorithm unifying pruning and quantization
Extends to structured pruning of MLPs and CNNs (see the sketch after this list)
Data-free analysis bridging theory and application for wide-network compression
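
A companion sketch for the structured case referenced in the list above: instead of individual weights, whole hidden units of a two-layer MLP are removed greedily. As before, the names, the candidate-subset randomization, and the Gaussian-probe error criterion are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward(W1, W2, X, mask):
    # Two-layer ReLU MLP; `mask` zeroes out pruned hidden units.
    return W2 @ (np.maximum(W1 @ X, 0.0) * mask[:, None])

def structured_greedy_prune(W1, W2, n_remove, n_candidates=32, n_probes=256):
    # Remove whole hidden units one at a time: each step scores a random
    # subset of surviving units by how little their removal changes the
    # network output on random Gaussian probes (data-free).
    X = rng.standard_normal((W1.shape[1], n_probes))
    mask = np.ones(W1.shape[0])
    base = forward(W1, W2, X, mask)
    for _ in range(n_remove):
        alive = np.flatnonzero(mask)
        cand = rng.choice(alive, size=min(n_candidates, alive.size), replace=False)
        errs = []
        for u in cand:
            mask[u] = 0.0
            errs.append(np.linalg.norm(forward(W1, W2, X, mask) - base))
            mask[u] = 1.0  # undo the trial removal
        mask[cand[int(np.argmin(errs))]] = 0.0  # drop the least harmful unit
    return mask

W1 = rng.standard_normal((512, 32)) / np.sqrt(32)    # wide hidden layer
W2 = rng.standard_normal((10, 512)) / np.sqrt(512)
mask = structured_greedy_prune(W1, W2, n_remove=256)
print("hidden units kept:", int(mask.sum()))
```

The same masking idea carries over to CNNs by removing whole channels rather than hidden units.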