Utilizing dynamic sparsity on pretrained DETR

📅 2025-10-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low inference efficiency of vision Transformers (e.g., DETR), this paper proposes a retraining-free dynamic sparsification method that exploits the inherent structural sparsity in pretrained MLP layers. The approach comprises two components: (1) Static Indicator-Based Sparsification (SIBS), which applies heuristic, fixed-pattern pruning; and (2) Micro-Gated Sparsification (MGS), which introduces lightweight linear gating modules to predict neuron activation states in real time, enabling input-adaptive sparsity of 85%–95%. Evaluated on COCO, MGS maintains or even improves mAP while substantially reducing FLOPs and latency. This work is the first to harness the intrinsic sparsity of pretrained DETR models for efficient inference, establishing a plug-and-play paradigm for compressing and deploying vision Transformers without architectural modification or retraining.
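The static branch (SIBS) described above prunes neurons whose activations follow fixed patterns. As a rough, hypothetical illustration of that idea (not the authors' implementation), the following numpy sketch records how often each hidden MLP neuron fires over a small calibration set and statically drops the rarely active ones; all weights, sizes, and the 5% firing-rate threshold are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_ff = 8, 32  # toy sizes; DETR's MLPs are much wider

# Frozen pretrained MLP weights (random stand-ins here)
W1 = rng.standard_normal((d_model, d_ff))
b1 = rng.standard_normal(d_ff) - 1.0  # bias shift so some neurons rarely fire
W2 = rng.standard_normal((d_ff, d_model))

# Calibration pass: fraction of inputs for which each neuron is active (post-ReLU > 0)
calib = rng.standard_normal((256, d_model))
fire_rate = np.mean(calib @ W1 + b1 > 0.0, axis=0)

# Static indicator: keep only neurons that fire on more than 5% of calibration inputs
keep = fire_rate > 0.05
W1s, b1s, W2s = W1[:, keep], b1[keep], W2[keep, :]

def mlp_static(x):
    # Pruned MLP: the dropped columns are never computed, for any input
    return np.maximum(x @ W1s + b1s, 0.0) @ W2s
```

Because the mask is fixed for all inputs, this only removes neurons that are inactive almost everywhere, which is why the summary notes SIBS yields limited gains on input-dependent sparsity.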

📝 Abstract
Efficient inference with transformer-based models remains a challenge, especially in vision tasks such as object detection. We analyze the inherent sparsity in the MLP layers of DETR and introduce two methods to exploit it without retraining. First, we propose Static Indicator-Based Sparsification (SIBS), a heuristic method that predicts neuron inactivity from fixed activation patterns. While simple, SIBS offers limited gains because the sparsity is input-dependent. To address this, we introduce Micro-Gated Sparsification (MGS), a lightweight gating mechanism trained on top of a pretrained DETR. MGS predicts dynamic sparsity using a small linear layer and achieves 85–95% activation sparsity. Experiments on the COCO dataset show that MGS maintains or even improves performance while significantly reducing computation. Our method offers a practical, input-adaptive approach to sparsification, enabling efficient deployment of pretrained vision transformers without full model retraining.
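The dynamic branch (MGS) uses a small linear layer to predict, per input, which hidden neurons will be active. A minimal numpy sketch of that mechanism, under stated assumptions: the gate weights `Wg` here are random placeholders (in the paper they are trained on top of the frozen DETR), and only the predicted-active columns of the MLP are computed:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32  # toy sizes for illustration

# Frozen pretrained MLP weights (random stand-ins here)
W1 = rng.standard_normal((d_model, d_ff))
b1 = rng.standard_normal(d_ff)
W2 = rng.standard_normal((d_ff, d_model))
b2 = rng.standard_normal(d_model)

# Hypothetical micro-gate: a cheap linear predictor of which hidden
# neurons will be nonzero after ReLU for this particular input
Wg = 0.1 * rng.standard_normal((d_model, d_ff))
bg = np.zeros(d_ff)

def mlp_dense(x):
    # Full MLP: every hidden neuron is computed
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def mlp_gated(x, threshold=0.0):
    mask = x @ Wg + bg > threshold    # input-adaptive activity prediction
    idx = np.nonzero(mask)[0]
    h = np.zeros(d_ff)
    # Compute only the predicted-active columns of W1 (the FLOPs saving)
    h[idx] = np.maximum(x @ W1[:, idx] + b1[idx], 0.0)
    return h @ W2 + b2

x = rng.standard_normal(d_model)
y = mlp_gated(x)  # same shape as the dense output, fewer hidden units computed
```

With the threshold lowered so that every neuron is predicted active, the gated MLP reduces exactly to the dense one, which is the sanity check one would run before measuring the accuracy/sparsity trade-off.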
Problem

Research questions and friction points this paper is trying to address.

Exploits dynamic sparsity in pretrained DETR MLP layers
Reduces computation while maintaining object detection performance
Enables efficient deployment without full model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic sparsity exploitation in pretrained DETR models
Micro-Gated Sparsification with lightweight gating mechanism
Achieves high activation sparsity without full model retraining
Reza Sedghi
CITEC, Bielefeld University, Germany

Anand Subramoney
Department of Computer Science, Royal Holloway, University of London, UK

David Kappel
Bielefeld University
efficient machine learning, neuromorphic engineering, computational neuroscience