Intra-class patch swap for self-distillation

📅 2025-05-01
🏛️ Neurocomputing
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional knowledge distillation relies on a large pre-trained teacher model, which incurs substantial storage overhead and high training cost and leaves the choice of teacher ambiguous; existing teacher-free distillation methods often require architectural modifications or complex training procedures. This paper proposes a lightweight, general-purpose self-distillation framework that uses only a single student network, with no auxiliary modules, structural changes, or additional learnable parameters. Its core innovation is an intra-class image patch swapping mechanism: guided by class labels, random patches are cropped and exchanged between samples of the same class. The paper presents this as the first use of intra-class local structural rearrangement in self-distillation, with the aim of jointly optimizing implicit knowledge transfer and feature disentanglement. Combined with consistency regularization and a feature-level self-distillation loss, the method improves ResNet-34 Top-1 accuracy by 1.8% on CIFAR-100 and 1.3% on ImageNet-1K, significantly outperforming standard self-distillation baselines.
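The patch swapping step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `intra_class_patch_swap`, the `patch_size` parameter, the random pairing of same-class samples, and the choice to swap a single patch at the same location in both images are all assumptions made for the example.

```python
import numpy as np

def intra_class_patch_swap(images, labels, patch_size=8, rng=None):
    """Swap one random patch between pairs of same-class images.

    images: array of shape (N, H, W, C); labels: int array of shape (N,).
    Returns a new array; the input batch is left unchanged.
    Hypothetical sketch: pairing and patch placement are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    images = np.asarray(images)
    labels = np.asarray(labels)
    out = images.copy()
    _, h, w, _ = images.shape

    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        if idx.size < 2:
            continue  # no same-class partner in this batch
        perm = rng.permutation(idx)
        # Pair consecutive shuffled indices; an odd leftover stays unchanged.
        for a, b in zip(perm[0::2], perm[1::2]):
            top = rng.integers(0, h - patch_size + 1)
            left = rng.integers(0, w - patch_size + 1)
            rows = slice(top, top + patch_size)
            cols = slice(left, left + patch_size)
            # Exchange the patch at the same location in both images.
            out[a, rows, cols] = images[b, rows, cols]
            out[b, rows, cols] = images[a, rows, cols]
    return out
```

Because the partner always shares the label, the swapped image keeps its original class label, so the augmentation needs no label mixing.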

Problem

Research questions and friction points this paper is trying to address.

Eliminates the need for pre-trained teacher networks in knowledge distillation
Simplifies self-distillation without architectural changes or extra parameters
Improves model performance across multiple vision tasks efficiently

Innovation

Methods, ideas, or system contributions that make the work stand out.

Intra-class patch swap augmentation for self-distillation
Single student network without auxiliary components
Model-agnostic and easy-to-implement augmentation function
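The summary does not give the form of the consistency term, so the following is only one plausible reading: the single network sees two patch-swapped views of a batch and is encouraged to produce matching predictions. The sketch uses a symmetric KL divergence between temperature-softened softmax outputs; the function names, the `temperature` default, and the symmetric-KL choice are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def symmetric_kl_consistency(logits_a, logits_b, temperature=4.0, eps=1e-12):
    """Mean symmetric KL between softened predictions of two views.

    Hypothetical loss: the paper's actual consistency term may differ.
    """
    p = softmax(logits_a, temperature)
    q = softmax(logits_b, temperature)
    log_p = np.log(p + eps)
    log_q = np.log(q + eps)
    kl_pq = np.sum(p * (log_p - log_q), axis=-1)
    kl_qp = np.sum(q * (log_q - log_p), axis=-1)
    return float(np.mean(kl_pq + kl_qp) / 2.0)
```

In a training loop, a term like this would typically be added to the usual cross-entropy loss with a weighting coefficient, which would itself be a tunable hyperparameter.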
Hongjun Choi
Lawrence Livermore National Laboratory, Livermore, 94550, CA, USA
Eunyeong Jeon
Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, 01811, South Korea
A. Shukla
Department of Computer Science and Engineering, University of Nevada, Reno, 89557, NV, USA
Pavan Turaga
Geometric Media Lab, Arizona State University
Computer Vision, Machine Learning, Geometry, Topology, Wearables