Attention-guided Feature Distillation for Semantic Segmentation

📅 2024-03-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low efficiency and structural redundancy of knowledge distillation for lightweight semantic segmentation models, this paper proposes an attention-guided feature distillation method. The core innovation lies in the first integration of the Convolutional Block Attention Module (CBAM) into the distillation pipeline, jointly modeling channel-wise and spatial attention to achieve fine-grained alignment between teacher and student feature maps. Crucially, high-level semantic knowledge is transferred using only the mean squared error (MSE) loss between the refined feature maps, which keeps architectural and training complexity low. Extensive experiments on four benchmark datasets (Pascal VOC 2012, Cityscapes, COCO, and CamVid) show that the student model achieves higher mean Intersection-over-Union (mIoU) than existing distillation approaches, establishing new state-of-the-art results and demonstrating a strong balance between accuracy and computational efficiency.

📝 Abstract
Deep learning models have achieved significant results across various computer vision tasks. However, due to the large number of parameters in these models, deploying them in real-time scenarios is a critical challenge, especially in dense prediction tasks such as semantic segmentation. Knowledge distillation has emerged as a successful technique for addressing this problem by transferring knowledge from a cumbersome model (teacher) to a lighter model (student). In contrast to the complex methodologies commonly employed for distilling knowledge from a teacher to a student, this paper showcases the efficacy of a simple yet powerful method that uses refined feature maps to transfer attention. The proposed method is effective in distilling rich information, outperforming existing methods on semantic segmentation as a dense prediction task. The proposed Attention-guided Feature Distillation (AttnFD) method employs the Convolutional Block Attention Module (CBAM), which refines feature maps by taking into account both channel-specific and spatial information content. By simply using the Mean Squared Error (MSE) loss between the refined feature maps of the teacher and the student, AttnFD demonstrates outstanding performance in semantic segmentation, achieving state-of-the-art results in improving the mean Intersection over Union (mIoU) of the student network on the Pascal VOC 2012, Cityscapes, COCO, and CamVid datasets.
Problem

Research questions and friction points this paper is trying to address.

Deploying large deep learning models in real-time semantic segmentation
Simplifying knowledge distillation for efficient teacher-to-student learning
Improving semantic segmentation accuracy using attention-guided feature distillation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attention-guided feature distillation for segmentation
Uses CBAM for channel and spatial attention
Simple MSE loss between teacher and student
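The innovation points above can be sketched in code. The snippet below is a minimal, simplified illustration (not the authors' implementation): a CBAM-style refinement applies channel attention from a shared ReLU MLP over pooled statistics, then spatial attention from channel-wise average and max maps, and the distillation loss is just the MSE between the refined teacher and student feature maps. The weights `w1`/`w2` are placeholders for learned parameters, and CBAM's 7x7 convolution in the spatial branch is replaced here by a fixed averaging for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f, w1, w2):
    """CBAM channel attention. f: (C, H, W); shared MLP weights
    w1: (C//r, C), w2: (C, C//r), r = reduction ratio."""
    avg = f.mean(axis=(1, 2))                       # global average pool -> (C,)
    mx = f.max(axis=(1, 2))                         # global max pool -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)    # shared ReLU MLP
    return sigmoid(mlp(avg) + mlp(mx))              # (C,) weights in (0, 1)

def spatial_attention(f):
    """CBAM spatial attention, with a fixed averaging standing in
    for the paper's learned 7x7 conv (a simplification)."""
    avg = f.mean(axis=0)                            # channel-wise mean -> (H, W)
    mx = f.max(axis=0)                              # channel-wise max -> (H, W)
    return sigmoid(0.5 * (avg + mx))                # (H, W) weights in (0, 1)

def refine(f, w1, w2):
    """Apply channel attention, then spatial attention (CBAM order)."""
    f = f * channel_attention(f, w1, w2)[:, None, None]
    return f * spatial_attention(f)[None, :, :]

def attnfd_loss(f_teacher, f_student, w1, w2):
    """MSE between CBAM-refined teacher and student feature maps."""
    rt = refine(f_teacher, w1, w2)
    rs = refine(f_student, w1, w2)
    return np.mean((rt - rs) ** 2)
```

In practice this loss would be summed over several intermediate layers and added to the student's segmentation loss; refining both feature maps before comparison focuses the MSE on the channels and regions the attention module deems informative, rather than matching raw activations uniformly.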