Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning

📅 2025-03-11
🤖 AI Summary
Transformer decoders in query-based 3D object detectors are computationally expensive, which hinders efficient deployment on edge devices. This paper proposes tgGBC (trim keys gradually Guided By Classification scores), a zero-shot runtime key pruning method that requires no retraining. tgGBC multiplies the attention map by the expanded classification scores to obtain an importance score for each key, then prunes the least important keys after each transformer layer. Its main contributions are: (1) a zero-shot key-importance scoring scheme that needs no retraining or fine-tuning; (2) classification-guided weighting of the attention map for key selection; (3) a 1.99× speedup of the transformer decoder in ToC3D with less than 1% performance loss, and a slight performance gain for certain models; and (4) deployment of the pruned detectors on an edge device, which further validates the method.

📝 Abstract
Query-based methods with dense features have demonstrated remarkable success in 3D object detection tasks. However, the computational demands of these models, particularly with large image sizes and multiple transformer layers, pose significant challenges for efficient running on edge devices. Existing pruning and distillation methods either need retraining or are designed for ViT models, which are hard to migrate to 3D detectors. To address this issue, we propose a zero-shot runtime pruning method for transformer decoders in 3D object detection models. The method, termed tgGBC (trim keys gradually Guided By Classification scores), systematically trims keys in transformer modules based on their importance. We expand the classification score to multiply it with the attention map to get the importance score of each key and then prune certain keys after each transformer layer according to their importance scores. Our method achieves a 1.99x speedup in the transformer decoder of the latest ToC3D model, with only a minimal performance loss of less than 1%. Interestingly, for certain models, our method even enhances their performance. Moreover, we deploy 3D detectors with tgGBC on an edge device, further validating the effectiveness of our method. The code can be found at https://github.com/iseri27/tg_gbc.
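
The scoring step described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation (see the linked repository for that): the function names, the array shapes, the use of one confidence value per query, and the sum over queries as the reduction are all assumptions made for the example.

```python
import numpy as np

def key_importance(attn, cls_scores):
    """Score each key by classification-weighted attention.

    attn:       (num_queries, num_keys) attention weights (assumed shape)
    cls_scores: (num_queries,) one confidence value per query (assumed)
    """
    # Expand the scores to (num_queries, 1) so they broadcast over keys.
    weighted = cls_scores[:, None] * attn
    # Assumed reduction: sum the weighted attention over all queries.
    return weighted.sum(axis=0)

def prune_keys(keys, attn, cls_scores, num_prune):
    """Drop the num_prune least important keys, keeping original order."""
    importance = key_importance(attn, cls_scores)
    num_keep = keys.shape[0] - num_prune
    # Indices of the most important keys, then restored to original order.
    top = np.argsort(-importance, kind="stable")[:num_keep]
    keep = np.sort(top)
    return keys[keep], keep

# Toy example: 2 queries, 4 keys with 2-dimensional features.
attn = np.array([[0.7, 0.2, 0.1, 0.0],
                 [0.1, 0.1, 0.1, 0.7]])
cls_scores = np.array([0.9, 0.1])
keys = np.arange(8).reshape(4, 2)

kept_keys, kept_idx = prune_keys(keys, attn, cls_scores, num_prune=2)
print(kept_idx)  # -> [0 1]
```

In this toy case the last key receives strong attention only from a low-confidence query, so it is pruned even though its raw attention weight is high. That is the intuition behind weighting the attention map by classification scores.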
Problem

Research questions and friction points this paper is trying to address.

Reduce computational demands of 3D object detection models
Enable efficient running on edge devices without retraining
Prune transformer decoder keys based on importance scores
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot runtime pruning for 3D detectors
Trim keys based on classification-guided importance
Achieves a 1.99x transformer decoder speedup on ToC3D with less than 1% performance loss
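
To see why trimming keys after every layer reduces decoder work, the sketch below counts how many keys each layer attends to under a fixed pruning amount. The numbers are toy values, not figures from the paper, and the assumption that cross-attention cost scales linearly with the number of keys ignores the scoring overhead and the rest of the decoder, so the ratio is not a prediction of the measured speedup. Both function names are hypothetical.

```python
def keys_per_layer(num_keys, num_layers, prune_per_layer):
    """Keys seen by each decoder layer when prune_per_layer keys
    are removed after every layer (at least one key is always kept)."""
    counts = []
    remaining = num_keys
    for _ in range(num_layers):
        counts.append(remaining)
        remaining = max(remaining - prune_per_layer, 1)
    return counts

def relative_attention_cost(counts):
    """Cross-attention cost relative to the unpruned decoder,
    assuming cost is proportional to the number of keys per layer."""
    return sum(counts) / (len(counts) * counts[0])

# Toy configuration: 1000 keys, 6 decoder layers, 150 keys pruned per layer.
counts = keys_per_layer(1000, 6, 150)
ratio = relative_attention_cost(counts)
print(counts)  # -> [1000, 850, 700, 550, 400, 250]
print(ratio)   # -> 0.625
```

Because pruning is gradual, early layers still see most keys and later layers see progressively fewer, which is where the savings accumulate.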
Authors
Lizhen Xu
Xi’an Jiaotong University
Xiuxiu Bai
Xi’an Jiaotong University
Xiaojun Jia
Nanyang Technological University
Explainable AI, Robust AI, Efficient AI
Jianwu Fang
Xi’an Jiaotong University
Scene understanding, Safe driving perception and planning
Shanmin Pang
Xi’an Jiaotong University