Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning

📅 2025-03-11

🤖 AI Summary

To reduce the high computational cost of Transformer decoders, which hinders deployment of 3D object detectors on edge devices, this paper proposes tgGBC (trim keys gradually Guided By Classification scores), a zero-shot runtime key pruning method that needs no retraining. tgGBC multiplies the classification scores with the attention map to obtain an importance score for each key, then prunes keys after each transformer layer according to those scores. The main contributions are: (1) a zero-shot pruning method for the transformer decoders of 3D detectors that requires no retraining; (2) a classification-guided importance score for selecting which keys to trim; (3) a 1.99× decoder speedup on the ToC3D detector with less than 1% performance loss, and a performance gain for certain models; and (4) deployment of 3D detectors with tgGBC on an edge device, which further validates the method.

📝 Abstract

Query-based methods with dense features have demonstrated remarkable success in 3D object detection tasks. However, the computational demands of these models, particularly with large image sizes and multiple transformer layers, pose significant challenges for efficient execution on edge devices. Existing pruning and distillation methods either need retraining or are designed for ViT models, which makes them hard to migrate to 3D detectors. To address this issue, we propose a zero-shot runtime pruning method for transformer decoders in 3D object detection models. The method, termed tgGBC (trim keys gradually Guided By Classification scores), systematically trims keys in transformer modules based on their importance. We expand the classification scores so that they can be multiplied with the attention map, which gives an importance score for each key, and then prune certain keys after each transformer layer according to these scores. Our method achieves a 1.99x speedup in the transformer decoder of the latest ToC3D model, with a performance loss of less than 1%. Interestingly, for certain models, our method even enhances their performance. Moreover, we deploy 3D detectors with tgGBC on an edge device, further validating the effectiveness of our method. The code can be found at https://github.com/iseri27/tg_gbc.
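
The scoring-and-pruning step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `key_importance` and `prune_keys` are hypothetical, and several details are assumptions not stated in the abstract: attention weights are taken as a single `(num_queries, num_keys)` matrix, each query has one scalar classification score, and per-key importance is obtained by summing the weighted attention over queries.

```python
import numpy as np


def key_importance(attn, cls_scores):
    """Score each key by classification-weighted attention.

    attn:       (num_queries, num_keys) attention weights (assumed head-averaged).
    cls_scores: (num_queries,) one classification confidence per query (assumed).
    Returns a (num_keys,) importance vector.
    """
    # Expand the scores to (num_queries, 1) so they multiply every key column.
    weighted = attn * cls_scores[:, None]
    # Assumption: aggregate over queries with a sum.
    return weighted.sum(axis=0)


def prune_keys(keys, values, attn, cls_scores, num_remove):
    """Drop the `num_remove` lowest-scoring keys (and their values)."""
    num_keys = keys.shape[0]
    if not 0 <= num_remove < num_keys:
        raise ValueError("num_remove must be in [0, num_keys)")
    importance = key_importance(attn, cls_scores)
    num_keep = num_keys - num_remove
    # Indices of the highest-scoring keys, restored to their original order.
    keep = np.sort(np.argsort(importance)[-num_keep:])
    return keys[keep], values[keep], keep
```

In a decoder, a step like this would run after each layer, so the next layer's cross-attention sees fewer keys; how many keys to remove per layer is a tuning choice that the abstract does not specify.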

Problem

Research questions and friction points this paper is trying to address.

Reduce computational demands of 3D object detection models

Enable efficient running on edge devices without retraining

Prune transformer decoder keys based on importance scores

Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot runtime pruning for 3D detectors

Trim keys based on classification-guided importance

Achieves a 1.99x speedup in the ToC3D transformer decoder with less than 1% performance loss
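
To see why trimming keys gradually reduces decoder work, the following back-of-the-envelope sketch may help. It assumes the key-dependent part of cross-attention scales linearly with the number of keys, and that a fixed number of keys is removed after each layer. The function name and the numbers in the example are hypothetical and are not taken from the paper, so the result is not expected to match the reported 1.99x figure.

```python
def cross_attention_cost_ratio(num_keys, num_layers, keys_removed_per_layer):
    """Key-dependent cross-attention work with pruning, relative to no pruning.

    Assumes cost per layer is proportional to the number of keys that layer sees,
    and that pruning happens after each layer (the first layer sees all keys).
    """
    if keys_removed_per_layer * (num_layers - 1) >= num_keys:
        raise ValueError("pruning schedule would remove every key")
    pruned = sum(
        num_keys - layer * keys_removed_per_layer for layer in range(num_layers)
    )
    return pruned / (num_keys * num_layers)


# Hypothetical setting: 1000 keys, 6 decoder layers, 100 keys removed per layer.
ratio = cross_attention_cost_ratio(1000, 6, 100)
print(ratio)  # 0.75
```

The estimate covers only the work that depends on the key count; self-attention among queries and feed-forward layers are unaffected by key pruning, so the end-to-end speedup of a real decoder depends on how much of its runtime cross-attention accounts for.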
