C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression

📅 2025-10-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing one-shot structured pruning methods, which require no fine-tuning, struggle to distinguish critical from redundant structures, often leading to significant accuracy degradation. Method: this paper proposes Causal-Aware Pruning (CAP), presented as the first framework to integrate causal inference into structured pruning. CAP uses interpretability analysis to quantify the causal contribution of neurons or modules to model predictions, guiding progressive, fine-grained structural removal without post-pruning fine-tuning. Contribution/Results: evaluated on convolutional networks and vision Transformers for image classification, CAP achieves up to 60% parameter compression with only a 0.3-0.8% Top-1 accuracy drop, substantially outperforming state-of-the-art fine-tuning-free pruning approaches. Its core innovation is a closed-loop pipeline (Attribution → Causal Inference → Pruning) that jointly optimizes compression ratio and task performance, enabling more reliable and efficient model slimming.

📝 Abstract
Neural network compression has gained increasing attention in recent years, particularly in computer vision applications, where model reduction is crucial for overcoming deployment constraints. Pruning is a widely used technique that promotes sparsity in model structures, e.g., weights, neurons, and layers, reducing size and inference costs. Structured pruning is especially important, as it allows for the removal of entire structures, which further accelerates inference and reduces memory overhead. However, it can be computationally expensive, requiring iterative retraining and optimization. To overcome this problem, recent methods consider a one-shot setting, which applies pruning directly post-training. Unfortunately, they often lead to a considerable drop in performance. In this paper, we address this issue by proposing a novel one-shot pruning framework that relies on explainable deep learning. First, we introduce a causal-aware pruning approach that leverages cause-effect relations between model predictions and structures in a progressive pruning process. This allows us to efficiently reduce the size of the network while ensuring that the removed structures do not degrade the performance of the model. Then, through experiments on convolutional neural network and vision transformer baselines pre-trained on classification tasks, we demonstrate that our method consistently achieves substantial reductions in model size with minimal impact on performance, and without the need for fine-tuning. Overall, our approach outperforms its counterparts, offering the best trade-off. Our code is available on GitHub.
Problem

Research questions and friction points this paper is trying to address.

Addressing performance drop in one-shot neural network pruning methods
Reducing computational costs of structured pruning without fine-tuning
Maintaining model accuracy while removing structures via explainable AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-shot pruning using explainable deep learning
Causal-aware pruning based on cause-effect relations
Progressive pruning without fine-tuning for efficiency
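The causal-aware, progressive pruning idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses unit ablation on a toy NumPy MLP as a crude stand-in for the paper's causal contribution measure, and all names (`forward`, `ablation_scores`, `progressive_prune`) and the pruning schedule are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer MLP: x -> ReLU(W1 x) -> W2 h, with 16 prunable hidden units.
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(4, 16))
X = rng.normal(size=(32, 8))  # small calibration batch, no labels needed

def forward(X, mask):
    # mask is a 0/1 vector that ablates (zeroes out) pruned hidden units
    h = np.maximum(X @ W1.T, 0.0) * mask
    return h @ W2.T

def ablation_scores(X, mask):
    # Crude causal effect: how much the output shifts when unit i is
    # removed from the current (already partially pruned) network.
    base = forward(X, mask)
    scores = np.full(16, np.inf)  # pruned units stay out of contention
    for i in np.flatnonzero(mask):
        m = mask.copy()
        m[i] = 0.0
        scores[i] = np.abs(forward(X, m) - base).mean()
    return scores

def progressive_prune(X, ratio=0.5, step=2):
    # Remove `step` lowest-impact units per round, re-scoring each round,
    # until `ratio` of the hidden units are gone -- no fine-tuning.
    mask = np.ones(16)
    keep = 16 - int(16 * ratio)
    while int(mask.sum()) > keep:
        scores = ablation_scores(X, mask)
        drop = np.argsort(scores)[:step]
        mask[drop] = 0.0
    return mask

mask = progressive_prune(X, ratio=0.5)
print(int(mask.sum()))  # 8 hidden units remain
```

Re-scoring after every removal round is what makes the process progressive: a unit that looked redundant in the full network may become important once its neighbors are gone, which one-shot magnitude criteria cannot capture.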