Effective Model Pruning

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of generality in model pruning caused by manual threshold selection. We propose EMP—a context-agnostic, parameter-free pruning criterion—whose core innovation is the first introduction of the effective number $N_{\text{eff}}$, defined via the reciprocal Simpson index, as a universal, adaptive metric for determining the number of parameters to retain. EMP requires no predefined scoring scheme or architectural assumptions and directly applies to diverse score vectors (e.g., weight magnitudes, attention scores). Leveraging simplex-geometric analysis, we theoretically derive a lower bound on the retention ratio and set the pruning threshold as $\eta N_{\text{eff}}$, where $\eta = 1$ ensures robust effectiveness. Empirically, EMP achieves near-original-model performance at high sparsity levels across MLPs, CNNs, Transformers/LLMs, and KANs—without any hyperparameter tuning—demonstrating substantial improvements in pruning generality and practicality.
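The effective number described above is the inverse (reciprocal) Simpson index of the normalized score vector: writing $p_i = s_i / \sum_j s_j$, it equals $1 / \sum_i p_i^2$. A minimal sketch of this computation (the function name `effective_number` is ours, not from the paper):

```python
import numpy as np

def effective_number(scores):
    """Inverse Simpson index of a nonnegative score vector.

    Normalizing s to a probability vector p, the index is
    1 / sum(p_i^2) = (sum s_i)^2 / sum(s_i^2).
    """
    s = np.asarray(scores, dtype=float)
    total = s.sum()
    if total == 0.0:
        return 0.0
    p = s / total
    return 1.0 / np.sum(p ** 2)

# A uniform vector has maximal N_eff (its length); a peaked one is smaller.
print(effective_number([1, 1, 1, 1]))   # -> 4.0
print(effective_number([10, 1, 1, 1]))  # -> about 1.64
```

Intuitively, $N_{\text{eff}}$ counts how many entries contribute meaningfully to the total mass, which is why it serves as an adaptive "how many to keep" threshold.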

📝 Abstract
We introduce Effective Model Pruning (EMP), a context-agnostic, parameter-free rule addressing a fundamental question about pruning: how many entries to keep. EMP does not prescribe how to score the parameters or prune the models; instead, it supplies a universal adaptive threshold that can be applied to any pruning criterion—weight magnitude, attention score, KAN importance score, or even feature-level signals such as image pixels—and used on structural parts or weights of the models. Given any score vector s, EMP maps s to a built-in effective number N_eff, which is inspired by the Inverse Simpson index of contributors. Retaining the N_eff highest-scoring entries and zeroing the remainder yields sparse models with performance comparable to the original dense networks across MLPs, CNNs, Transformers/LLMs, and KANs in our experiments. By leveraging the geometry of the simplex, we derive a tight lower bound on the preserved mass s_eff (the sum of retained scores) over the ordered probability simplex associated with the score vector s. We further verify the effectiveness of N_eff by pruning models with a scaled threshold η·N_eff across a variety of criteria and models. Experiments suggest that the default η = 1 yields a robust threshold for model pruning, while η ≠ 1 still serves as an optional adjustment to meet specific sparsity requirements.
Problem

Research questions and friction points this paper is trying to address.

Determining the optimal number of entries to retain during pruning
Providing a universal threshold applicable to diverse pruning criteria
Maintaining model performance while achieving sparsity across architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Universal adaptive threshold applicable to any pruning criterion
Retains the N_eff highest-scoring entries of a given score vector
Derives the threshold from the geometry of the probability simplex
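The innovations above combine into a simple procedure: compute N_eff from the scores, then keep the top ⌈η·N_eff⌉ entries and zero the rest. A hedged sketch under those assumptions (the helper name `emp_prune` and the ceiling rounding are our choices; the paper's default is η = 1):

```python
import numpy as np

def emp_prune(scores, eta=1.0):
    """Zero out all but the top ceil(eta * N_eff) entries of `scores`.

    Returns the pruned score vector and the number of retained entries.
    """
    s = np.asarray(scores, dtype=float)
    p = s / s.sum()
    n_eff = 1.0 / np.sum(p ** 2)               # inverse Simpson index
    k = min(len(s), int(np.ceil(eta * n_eff))) # retained count, capped at len(s)
    keep = np.argsort(s)[::-1][:k]             # indices of the k largest scores
    mask = np.zeros_like(s)
    mask[keep] = 1.0
    return s * mask, k
```

For a peaked vector such as [10, 1, 1, 1], N_eff ≈ 1.64, so with η = 1 only two entries survive; η scales the threshold when a specific sparsity level is required.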