GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost

📅 2024-05-23
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
To address the underutilization of soft labels in synthetic datasets and the high sensitivity of model training to the choice of loss function in dataset distillation, this paper proposes GIFT, a zero-overhead, parameter-free, plug-and-play framework. GIFT improves soft-label exploitation via soft-label refinement and a cosine similarity-based loss function, unlocking the fine-grained inter-class relational information embedded in soft labels and thereby improving the robustness and generalization of models trained on distilled data. The paper's experiments first reveal that training on synthetic data is critically sensitive to the soft-label loss function. Extensive evaluations on benchmarks of multiple scales, including ImageNet-1K, show that GIFT consistently improves state-of-the-art dataset distillation methods, achieving up to a 30.8% gain in cross-optimizer generalization without incurring any additional computational cost.

📝 Abstract
Recent advancements in dataset distillation have demonstrated the significant benefits of employing soft labels generated by pre-trained teacher models. In this paper, we introduce a novel perspective by emphasizing the full utilization of labels. We first conduct a comprehensive comparison of various loss functions for soft label utilization in dataset distillation, revealing that the model trained on the synthetic dataset exhibits high sensitivity to the choice of loss function for soft label utilization. This finding highlights the necessity of a universal loss function for training models on synthetic datasets. Building on these insights, we introduce an extremely simple yet surprisingly effective plug-and-play approach, GIFT, which encompasses soft label refinement and a cosine similarity-based loss function to efficiently leverage full label information. Extensive experiments indicate that GIFT consistently enhances state-of-the-art dataset distillation methods across various dataset scales, without incurring additional computational costs. Importantly, GIFT significantly enhances cross-optimizer generalization, an area previously overlooked. For instance, on ImageNet-1K with IPC = 10, GIFT enhances the state-of-the-art method RDED by 30.8% in cross-optimizer generalization. Our code is available at https://github.com/LINs-lab/GIFT.
Problem

Research questions and friction points this paper is trying to address.

Enhancing dataset distillation with full label utilization
Developing a universal loss function for synthetic datasets
Improving cross-optimizer generalization in dataset distillation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes soft labels from pre-trained models
Introduces GIFT for label refinement
Employs cosine similarity-based loss function
Xinyi Shang
University College London
Peng Sun
Zhejiang University, Westlake University
Tao Lin
Westlake University