Learnable Instance Attention Filtering for Adaptive Detector Distillation

πŸ“… 2026-03-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing knowledge distillation methods for object detection typically treat all instances equally and employ heuristic or teacher-only attention filtering mechanisms, overlooking the student model’s learning dynamics and instance-level variations. This work proposes a learnable instance-aware attention filtering framework that, for the first time, integrates the student model’s dynamic learning state into an instance-level attention mechanism. By introducing a trainable instance selector, the method adaptively reweights the importance of each instance, enabling end-to-end adaptive distillation. This approach departs from conventional static or teacher-dominated filtering paradigms and achieves significant performance gains on KITTI and COCO benchmarks: a GFL ResNet-50 student model improves by 2% mAP without additional computational overhead, outperforming current state-of-the-art methods.
πŸ“ Abstract
As deep vision models grow increasingly complex to achieve higher performance, deployment efficiency has become a critical concern. Knowledge distillation (KD) mitigates this issue by transferring knowledge from large teacher models to compact student models. While many feature-based KD methods rely on spatial filtering to guide distillation, they typically treat all object instances uniformly, ignoring instance-level variability. Moreover, existing attention filtering mechanisms are typically heuristic or teacher-driven, rather than learned with the student. To address these limitations, we propose Learnable Instance Attention Filtering for Adaptive Detector Distillation (LIAF-KD), a novel framework that introduces learnable instance selectors to dynamically evaluate and reweight instance importance during distillation. Notably, the student contributes to this process based on its evolving learning state. Experiments on the KITTI and COCO datasets demonstrate consistent improvements, with a 2% gain on a GFL ResNet-50 student without added complexity, outperforming state-of-the-art methods.
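The abstract describes the core mechanism only at a high level: a trainable instance selector scores each object instance, and those scores reweight a per-instance feature-mimicking loss. The paper's equations are not given here, so the sketch below is a rough, hypothetical illustration, assuming a softmax-normalized selector over raw instance scores and a mean-squared-error mimicking term; in a real training loop the selector scores would be produced by a small learnable module and updated end-to-end by autograd.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def instance_weighted_distill_loss(teacher_feats, student_feats, selector_scores):
    """Instance-reweighted feature-mimicking loss (illustrative sketch).

    teacher_feats, student_feats: (N, D) per-instance feature vectors.
    selector_scores: (N,) raw scores from a hypothetical trainable selector;
    softmax turns them into importance weights summing to 1, so "easy" or
    uninformative instances can be down-weighted during distillation.
    """
    w = softmax(selector_scores)                                    # (N,) weights
    per_instance = ((teacher_feats - student_feats) ** 2).mean(axis=1)  # (N,) MSE
    return float((w * per_instance).sum())

# Toy check: 3 instances, 4-dim features, uniform selector scores.
teacher = np.ones((3, 4))
student = np.zeros((3, 4))
loss = instance_weighted_distill_loss(teacher, student, np.zeros(3))
print(loss)  # 1.0: each instance has MSE 1.0, weights are uniform (1/3 each)
```

With uniform scores this reduces to a plain average, which is the baseline the paper argues against; the gain comes when the selector learns non-uniform scores that reflect the student's evolving learning state.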
Problem

Research questions and friction points this paper is trying to address.

knowledge distillation, object detection, instance-level variability, attention filtering, student-teacher learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

learnable instance attention, adaptive distillation, knowledge distillation, object detection, instance-level filtering
πŸ‘₯ Authors
Chen Liu (Dept. of CS, University of Alabama at Birmingham)
Qizhen Lan (UTHealth Houston): Computer Vision, Knowledge Distillation, Object Detection, Medical Imaging, Statistical Modeling
Zhicheng Ding (Dept. of CS, Bowling Green State University)
Xinyu Chu (Dept. of CS, Bowling Green State University)
Qing Tian (University of Alabama at Birmingham): Computer Vision, Machine Learning, Deep Learning, Autonomous Driving