🤖 AI Summary
Statistical exchangeability among parameters and intermediate activations in neural networks induces symmetric redundancy, leading to computational inefficiency, yet existing compression methods fail to systematically exploit this property. This paper formally introduces statistical exchangeability theory into neural network analysis and proposes ExPrune, an input-adaptive dynamic pruning framework. ExPrune identifies interchangeable neurons via exchangeability modeling, performs input-conditioned dynamic pruning during inference, and integrates with static pruning. Its methodology encompasses exchangeability-aware neuron modeling, dynamic neuron pruning, ReLU negative-input prediction, and cross-modal generalization (CV, graph, NLP). Evaluated on four model families, ExPrune reduces FLOPs by 10.98–26.3% with negligible accuracy loss (<0.1%), or by 21.01–39.05% with ≤1% accuracy degradation. Moreover, it achieves additional FLOPs savings of 10.24–14.39% when applied atop statically pruned models.
📄 Abstract
Neural networks (NNs) are equipped with increasingly many parameters and require ever more resources for deployment. Researchers have explored various ways to improve the efficiency of NNs by identifying and reducing redundancy, such as pruning or quantizing unimportant weights. Symmetry in NN architectures has been identified by prior work as a possible type of redundancy, but exploiting it for efficient inference has not yet been explored. In this work, we formalize the symmetry of parameters and intermediate values in NNs using the statistical property of exchangeability. We identify that exchangeable values in NN computation may contain overlapping information, leading to redundancy. Exploiting this insight, we derive a principled, general dynamic pruning algorithm, ExPrune, that removes symmetry-induced redundancy on a per-input basis. We also provide an instantiation of ExPrune that performs neuron-level dynamic pruning by predicting negative inputs to ReLU activations. We evaluate ExPrune on two computer vision models, one graph model and one language model. ExPrune provides a 10.98--26.3% reduction in FLOPs with negligible accuracy drop and a 21.01--39.05% reduction in FLOPs with at most 1% accuracy drop. We also demonstrate that ExPrune composes with static pruning: on models that have been aggressively pruned statically, ExPrune provides an additional 10.24--11.11% reduction in FLOPs with negligible accuracy drop and 13.91--14.39% reduction in FLOPs with at most 1% accuracy drop.
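To make the ReLU-based instantiation concrete, the sketch below illustrates the general idea of predicting negative ReLU inputs to skip work: each neuron's pre-activation is cheaply estimated from a small subset of its largest-magnitude weights, and if the estimate is below a threshold the full dot product is skipped and the output is set to zero. This is an illustrative sketch only; the paper's actual predictor, threshold policy, and parameter names (`k`, `threshold`, `exprune_layer`) are assumptions, not the authors' implementation.

```python
import numpy as np

def exprune_layer(x, W, b, k=8, threshold=0.0):
    """Sketch of dynamic neuron pruning for a fully connected ReLU layer.

    For each output neuron, estimate the pre-activation W[j] @ x + b[j]
    using only that neuron's k largest-magnitude weights. If the estimate
    is below `threshold`, predict the ReLU input is negative and emit 0
    without computing the full dot product (saving FLOPs).
    """
    n_out = W.shape[0]
    out = np.zeros(n_out)
    # In practice these indices would be precomputed offline once per model:
    # the k largest-|w| weight positions for each neuron (row of W).
    top_idx = np.argsort(-np.abs(W), axis=1)[:, :k]
    for j in range(n_out):
        idx = top_idx[j]
        estimate = W[j, idx] @ x[idx] + b[j]
        if estimate < threshold:
            continue  # predicted negative pre-activation: ReLU would zero it
        out[j] = max(0.0, W[j] @ x + b[j])  # full computation only when needed
    return out
```

Raising `threshold` trades accuracy for more skipped neurons, which matches the paper's reported spectrum from negligible accuracy drop to an at-most-1% drop at higher FLOP savings; the input-dependence of which neurons are skipped is what makes the pruning dynamic rather than static.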