CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation

📅 2025-03-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large Vision Foundation Models (LVFMs) and lightweight edge models (e.g., MobileNetV3) exhibit significant architectural and capacity disparities, hindering effective knowledge distillation—especially in label-scarce or unsupervised settings. To address this, we propose CustomKD, a knowledge distillation framework customized for edge models. Its core innovation is a label-free teacher–student feature-space alignment mechanism that adapts the generalizable representations of LVFMs (e.g., DINOv2, CLIP) to student models. To our knowledge, CustomKD is the first distillation framework enabling customized, architecture-aware transfer of LVFM knowledge to resource-constrained edge models. Extensive experiments demonstrate state-of-the-art performance across diverse low-label regimes: unsupervised domain adaptation on OfficeHome and DomainNet, semi-supervised learning on CIFAR-100 with only 400 labeled samples, and ImageNet with merely 1% labeled data.

📝 Abstract
We propose a novel knowledge distillation approach, CustomKD, that effectively leverages large vision foundation models (LVFMs) to enhance the performance of edge models (e.g., MobileNetV3). Despite recent advancements in LVFMs, such as DINOv2 and CLIP, their potential in knowledge distillation for enhancing edge models remains underexplored. While knowledge distillation is a promising approach for improving the performance of edge models, the discrepancy in model capacity and the heterogeneous architectures of LVFMs and edge models pose a significant challenge. Our observation indicates that although utilizing larger backbones (e.g., ViT-S to ViT-L) in teacher models improves their downstream task performance, knowledge distillation from these large teachers fails to bring a comparable performance gain to student models because of the large model discrepancy. Our simple yet effective CustomKD customizes the well-generalized features inherent in LVFMs to a given student model in order to reduce this discrepancy. Specifically, beyond providing well-generalized original knowledge from teachers, CustomKD aligns the features of teachers to those of students, making them easier for students to absorb and thereby overcoming the large model discrepancy. CustomKD significantly improves the performance of edge models in scenarios with unlabeled data, such as unsupervised domain adaptation (e.g., OfficeHome and DomainNet) and semi-supervised learning (e.g., CIFAR-100 with 400 labeled samples and ImageNet with 1% labeled samples), achieving new state-of-the-art performance.
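The feature-alignment idea described above (projecting a teacher's features into the student's feature space and matching them without labels) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the projection matrix `proj`, the function names, and the cosine-distance objective are assumptions chosen for clarity; the actual CustomKD alignment may differ in architecture and loss.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Normalize feature vectors to unit length for a cosine-style comparison.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def alignment_loss(teacher_feats, student_feats, proj):
    """Label-free alignment loss (hypothetical sketch).

    teacher_feats: (B, D_t) features from the LVFM teacher (e.g., DINOv2)
    student_feats: (B, D_s) features from the edge student (e.g., MobileNetV3)
    proj:          (D_t, D_s) learnable projection mapping teacher features
                   into the student's feature space

    Returns the mean cosine distance between projected teacher features and
    student features; only unlabeled images are needed to compute it.
    """
    t = l2_normalize(teacher_feats @ proj)  # teacher features in student space
    s = l2_normalize(student_feats)
    return float(np.mean(1.0 - np.sum(t * s, axis=-1)))
```

In a training loop, this term would be minimized alongside a standard logit-distillation loss; driving it to zero means the (projected) teacher and student representations agree on every unlabeled sample.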
Problem

Research questions and friction points this paper is trying to address.

Reducing model discrepancy between large vision foundation models and edge models
Enhancing edge model performance via customized knowledge distillation
Improving unsupervised and semi-supervised learning with unlabeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

CustomKD customizes LVFM features for edge models
Aligns teacher and student features to reduce discrepancy
Improves edge model performance with unlabeled data