Towards Universal & Efficient Model Compression via Exponential Torque Pruning

📅 2025-06-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the substantial computational and memory overhead of oversized modern deep neural networks, this paper proposes an efficient structured pruning method based on exponential force-field regularization. Unlike conventional linear torque regularization, it introduces an exponential force function to constrain weights: redundant modules farther from the pivot point experience exponentially stronger compressive forces, which significantly improves pruning selectivity and automation. The method jointly optimizes weights and architecture during training, achieving up to 10.3× model compression across diverse vision and language tasks with less than 0.3% accuracy degradation, outperforming current state-of-the-art approaches. Key contributions include: (1) a novel exponential force-field regularization mechanism that dynamically modulates structural sparsity; and (2) an end-to-end trainable structured pruning framework that delivers high fidelity and high compression ratios while integrating seamlessly into standard training pipelines.
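The exact force function is not reproduced on this page; as a hedged illustration, one plausible contrast between the linear torque scheme and an exponential one (with d the module's distance from the pivot, and λ and α assumed regularization-strength and steepness hyperparameters, not taken from the paper) is:

F_{\text{linear}}(d) = \lambda\, d, \qquad F_{\text{exp}}(d) = \lambda \left( e^{\alpha d} - 1 \right)

Both forces vanish at d = 0, but the exponential one dominates at large d, so distant (redundant) modules are driven toward zero far more aggressively while modules close to the pivot are left nearly untouched.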

📝 Abstract
The rapid growth in complexity and size of modern deep neural networks (DNNs) has increased challenges related to computational costs and memory usage, spurring a growing interest in efficient model compression techniques. A previous state-of-the-art approach proposes a Torque-inspired regularization that forces the weights of neural modules toward a selected pivot point. However, we observe that the pruning effect of this approach is far from perfect: the post-trained network is still dense and also suffers from a high accuracy drop. In this work, we attribute this ineffectiveness to the default linear force application scheme, which imposes inappropriate force on neural modules at different distances. To efficiently prune the redundant, distant modules while retaining those that are close and necessary for effective inference, we propose Exponential Torque Pruning (ETP), which adopts an exponential force application scheme for regularization. Experimental results across a broad range of domains demonstrate that, despite being extremely simple, ETP achieves a significantly higher compression rate than previous state-of-the-art pruning strategies with negligible accuracy drop.
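As a concrete sketch of how such an exponential force scheme could be wired into training, the PyTorch snippet below penalizes each output channel of a convolution by a norm weighted exponentially in its distance from a pivot channel. The function name etp_penalty, the pivot choice (channel 0), and the hyperparameters alpha and lam are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

def etp_penalty(layer: nn.Conv2d, alpha: float = 0.1, lam: float = 1e-4) -> torch.Tensor:
    """Exponential-torque-style regularizer: output channels farther from
    the pivot (index 0 here) receive exponentially stronger force toward zero."""
    # Group weights by output channel: shape (C_out, C_in * kH * kW)
    w = layer.weight.flatten(start_dim=1)
    # Distance of each channel from the assumed pivot channel 0
    dist = torch.arange(w.shape[0], dtype=w.dtype, device=w.device)
    # Exponential force growing with distance (a linear torque scheme would use dist directly)
    force = torch.exp(alpha * dist) - 1.0
    # Penalize each channel's L2 norm, scaled by its force
    return lam * (force * w.norm(dim=1)).sum()

# Usage: add the penalty to the task loss during training
conv = nn.Conv2d(16, 32, kernel_size=3)
x = torch.randn(4, 16, 8, 8)
task_loss = conv(x).pow(2).mean()  # stand-in for the real task loss
loss = task_loss + etp_penalty(conv)
loss.backward()

After training, channels whose norms have been forced to near zero can be removed structurally, which is what yields the compression the abstract reports.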
Problem

Research questions and friction points this paper is trying to address.

Addresses inefficient pruning in dense neural networks
Reduces high accuracy drop in model compression
Improves compression rates with exponential force scheme
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exponential force application for regularization
Higher compression rate with negligible accuracy drop
Efficient pruning of redundant distant modules
👥 Authors
Sarthak K. Modi, Nanyang Technological University
Lim Zi Pong, Continental Automotive Singapore
Shourya Kuchhal, Nanyang Technological University
Yushi Cao, Nanyang Technological University
Yupeng Cheng, Nanyang Technological University
Teo Yon Shin, Continental Automotive Singapore
Lin Shang-Wei, Singapore Institute of Technology
Zhiming Li, Central South University