Improving Diversity in Black-box Few-shot Knowledge Distillation

📅 2026-04-28
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses black-box few-shot knowledge distillation, where extremely limited training data and the lack of access to the teacher model's internals lead to insufficiently diverse synthesized images, which constrains student model performance. To overcome this limitation, the authors propose a training scheme for generative adversarial networks that adaptively selects high-confidence synthetic images under the teacher's supervision and introduces them into adversarial learning on the fly, expanding the diversity of the distillation set. Experiments on seven image datasets show that the method significantly improves student accuracy and achieves state-of-the-art results among few-shot KD methods.
📝 Abstract
Knowledge distillation (KD) is a well-known technique to effectively compress a large network (teacher) into a smaller network (student) with little sacrifice in performance. However, most KD methods require a large training set and internal access to the teacher, which are rarely available due to various restrictions. These challenges have given rise to a more practical setting known as black-box few-shot KD, where the student is trained with few images and a black-box teacher. Recent approaches typically generate additional synthetic images but lack an active strategy to promote their diversity, a crucial factor for student learning. To address these problems, we propose a novel training scheme for generative adversarial networks, where we adaptively select high-confidence images under the teacher's supervision and introduce them to the adversarial learning on-the-fly. Our approach helps expand and improve the diversity of the distillation set, significantly boosting student accuracy. Through extensive experiments, we achieve state-of-the-art results among few-shot KD methods on seven image datasets. The code is available at https://github.com/votrinhan88/divbfkd.
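The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes the black-box teacher returns only class probabilities, and it makes the threshold "adaptive" by keeping the top fraction of each batch ranked by teacher confidence. The names `select_high_confidence`, `teacher_predict`, and `keep_fraction` are hypothetical, and the quantile rule is an assumption, since the abstract does not specify the selection criterion.

```python
import math
from typing import Callable, List, Sequence, Tuple

Probs = Sequence[float]


def select_high_confidence(
    candidates: Sequence[object],
    teacher_predict: Callable[[object], Probs],
    keep_fraction: float = 0.5,
) -> List[Tuple[object, int, float]]:
    """Keep the candidates the black-box teacher is most confident about.

    Returns (image, predicted_label, confidence) triples. The threshold is
    the confidence of the n-th best candidate in this batch, so it moves
    with the batch instead of being a fixed constant (an assumed rule).
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    scored = []
    for image in candidates:
        probs = teacher_predict(image)  # only the outputs are visible
        label = max(range(len(probs)), key=probs.__getitem__)
        scored.append((image, label, probs[label]))
    if not scored:
        return []

    # Batch-dependent threshold: confidence of the n_keep-th best candidate.
    ranked = sorted((conf for _, _, conf in scored), reverse=True)
    n_keep = math.ceil(keep_fraction * len(scored))
    threshold = ranked[n_keep - 1]
    return [item for item in scored if item[2] >= threshold]


# Toy stand-in for a teacher: a lookup table of class probabilities.
toy_outputs = {
    "a": [0.9, 0.1],
    "b": [0.55, 0.45],
    "c": [0.2, 0.8],
    "d": [0.5, 0.5],
}
selected = select_high_confidence(
    list(toy_outputs), toy_outputs.__getitem__, keep_fraction=0.5
)
```

In a full pipeline, the selected images would be added to the set used in adversarial training while the generator is still being trained, which is the "on-the-fly" part of the scheme. That loop is omitted here because the abstract does not describe it in enough detail to sketch faithfully.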
Problem

Research questions and friction points this paper is trying to address.

black-box knowledge distillation
few-shot learning
synthetic image diversity
model compression
student-teacher learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

black-box few-shot knowledge distillation
diversity enhancement
generative adversarial networks
adaptive image selection
synthetic data generation