AI Summary
Existing knowledge distillation methods for object detection typically treat all instances equally and employ heuristic or teacher-only attention filtering mechanisms, overlooking the student model's learning dynamics and instance-level variations. This work proposes a learnable instance-aware attention filtering framework that, for the first time, integrates the student model's dynamic learning state into an instance-level attention mechanism. By introducing a trainable instance selector, the method adaptively reweights the importance of each instance, enabling end-to-end adaptive distillation. This approach departs from conventional static or teacher-dominated filtering paradigms and achieves significant performance gains on the KITTI and COCO benchmarks: a GFL ResNet-50 student model improves by 2% mAP without additional computational overhead, outperforming current state-of-the-art methods.
Abstract
As deep vision models grow increasingly complex in pursuit of higher performance, deployment efficiency has become a critical concern. Knowledge distillation (KD) mitigates this issue by transferring knowledge from large teacher models to compact student models. While many feature-based KD methods rely on spatial filtering to guide distillation, they typically treat all object instances uniformly, ignoring instance-level variability. Moreover, existing attention filtering mechanisms are usually heuristic or teacher-driven, rather than learned jointly with the student. To address these limitations, we propose Learnable Instance Attention Filtering for Adaptive Detector Distillation (LIAF-KD), a novel framework that introduces learnable instance selectors to dynamically evaluate and reweight instance importance during distillation. Notably, the student contributes to this process based on its evolving learning state. Experiments on the KITTI and COCO datasets demonstrate consistent improvements, with a 2% gain on a GFL ResNet-50 student without added complexity, outperforming state-of-the-art methods.
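The abstract does not specify the selector's architecture, so the following is only a minimal NumPy sketch of the general idea: a small learnable scorer looks at per-instance features from both student and teacher, produces importance weights, and those weights rescale a per-instance feature-distillation loss. All function names, the concatenated student/teacher input, the sigmoid-plus-normalization scoring, and the MSE distillation term are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def instance_selector(inst_feats, W, b):
    """Hypothetical learnable selector: scores each instance's pooled
    features and returns importance weights normalized over instances.
    W and b would be trained end-to-end with the distillation loss."""
    scores = sigmoid(inst_feats @ W + b)        # (N, 1) raw per-instance scores
    weights = scores / (scores.sum() + 1e-6)    # normalize across instances
    return weights.squeeze(-1)                  # (N,)

def weighted_distill_loss(student_feats, teacher_feats, weights):
    """Per-instance MSE between student and teacher features,
    reweighted by the learned instance importance (an assumed loss form)."""
    per_inst = ((student_feats - teacher_feats) ** 2).mean(axis=1)  # (N,)
    return float((weights * per_inst).sum())

rng = np.random.default_rng(0)
N, D = 4, 8                                   # 4 instances, 8-dim pooled features
s = rng.normal(size=(N, D))                   # student instance features (toy data)
t = rng.normal(size=(N, D))                   # teacher instance features (toy data)
sel_in = np.concatenate([s, t], axis=1)       # selector sees both views
W = rng.normal(size=(2 * D, 1)) * 0.1         # toy selector parameters
b = np.zeros(1)

w = instance_selector(sel_in, W, b)
loss = weighted_distill_loss(s, t, w)
```

Because the weights depend on the student's current features, easier or harder instances are emphasized differently as training progresses, which is the "evolving learning state" aspect the abstract highlights.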