🤖 AI Summary
To address the inferior performance of logit distillation compared to feature-based distillation, this paper proposes a perception-driven logit calibration mechanism. It introduces the concept of “perception” into knowledge distillation for the first time, reconstructing logits by modeling intra-batch semantic relationships among samples to mitigate model overconfidence and enhance knowledge transfer fidelity. The method operates solely on logits—without requiring teacher feature maps—yet achieves state-of-the-art performance within pure-logit distillation frameworks. On ImageNet, it improves ResNet-18 and MobileNetV2 by 1.5% and 2.05% over standard Knowledge Distillation (KD), respectively; it also significantly outperforms advanced feature distillation methods on CIFAR-100 and MS COCO. Key contributions include: (i) a novel perception-aware calibration paradigm; (ii) a batch-level logit relational modeling mechanism; and (iii) empirical validation that high-performance distillation is feasible without accessing intermediate features.
📝 Abstract
In the knowledge distillation literature, feature-based methods have dominated due to their ability to effectively tap into extensive teacher models. In contrast, logit-based approaches, which aim to distill 'dark knowledge' from teachers, typically exhibit inferior performance compared to feature-based methods. To bridge this gap, we present LumiNet, a novel knowledge distillation algorithm designed to enhance logit-based distillation. We introduce the concept of 'perception', aiming to calibrate logits based on the model's representation capability. This concept addresses overconfidence issues in logit-based distillation methods while also introducing a novel way to distill knowledge from the teacher: it reconstructs the logits of each sample by considering its relationships with other samples in the batch. LumiNet excels on benchmarks like CIFAR-100, ImageNet, and MS COCO, outperforming leading feature-based methods; e.g., compared to KD with ResNet-18 and MobileNetV2 on ImageNet, it shows improvements of 1.5% and 2.05%, respectively.
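The batch-level logit reconstruction described above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's exact formulation): each sample's logits are recalibrated against per-class statistics computed across the batch, so a sample's "perception" of a class reflects how its logit compares with those of the other samples rather than its raw, possibly overconfident value. The function name and the standardization choice are assumptions for illustration.

```python
import numpy as np

def batch_calibrated_logits(logits: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Hypothetical sketch of perception-style calibration:
    recalibrate each sample's logits using per-class statistics
    taken across the batch, so each value expresses how the sample
    relates to the other samples in the batch for that class.

    logits: array of shape (batch_size, num_classes)
    """
    mu = logits.mean(axis=0, keepdims=True)          # per-class batch mean
    sigma = logits.std(axis=0, keepdims=True) + eps  # per-class batch std
    return (logits - mu) / sigma                     # batch-standardized logits

# Example: calibrate both teacher and student logits before the
# usual softened-distribution distillation loss would be applied.
batch = np.array([[4.0, 0.5, 1.0],
                  [1.0, 3.5, 0.0],
                  [0.5, 0.5, 2.5]])
calibrated = batch_calibrated_logits(batch)
```

Because only logits are touched, such a scheme needs no access to teacher feature maps, which matches the paper's claim that the method operates within a pure-logit distillation framework.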