🤖 AI Summary
To address the underutilization of soft labels from synthetic datasets and the high sensitivity of model training to the design of the soft-label loss function in dataset distillation, this paper proposes GIFT, a zero-overhead, parameter-free, plug-and-play framework. GIFT combines soft label refinement with a cosine similarity-based loss function to fully exploit the fine-grained inter-class relational information embedded in soft labels, thereby improving the robustness and generalization of models trained on distilled data. The experiments first reveal that training on synthetic data is highly sensitive to the choice of soft-label loss function. Extensive evaluations on benchmarks across scales, including ImageNet-1K, demonstrate that GIFT consistently outperforms state-of-the-art distillation methods; notably, it achieves up to a 30.8% improvement in cross-optimizer generalization, all without incurring any additional computational cost.
📝 Abstract
Recent advancements in dataset distillation have demonstrated the significant benefits of employing soft labels generated by pre-trained teacher models. In this paper, we introduce a novel perspective by emphasizing the full utilization of labels. We first conduct a comprehensive comparison of various loss functions for soft label utilization in dataset distillation, revealing that models trained on synthetic datasets are highly sensitive to the choice of loss function for soft label utilization. This finding highlights the necessity of a universal loss function for training models on synthetic datasets. Building on these insights, we introduce an extremely simple yet surprisingly effective plug-and-play approach, GIFT, which combines soft label refinement with a cosine similarity-based loss function to efficiently leverage full label information. Extensive experiments indicate that GIFT consistently enhances state-of-the-art dataset distillation methods across various dataset scales, without incurring additional computational costs. Importantly, GIFT significantly improves cross-optimizer generalization, an area previously overlooked. For instance, on ImageNet-1K with IPC = 10, GIFT enhances the state-of-the-art method RDED by 30.8% in cross-optimizer generalization. Our code is available at https://github.com/LINs-lab/GIFT.
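The two ingredients named above, soft label refinement and a cosine similarity-based loss, could be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the blending rule in `refine_soft_labels` (mixing teacher probabilities with the one-hot ground truth via a hypothetical weight `alpha`) is an assumed placeholder for whatever refinement GIFT actually performs, and the loss simply penalizes the angle between student and refined teacher probability vectors.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the class axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def refine_soft_labels(teacher_logits, true_labels, alpha=0.1):
    # Illustrative refinement (assumption, not GIFT's exact rule):
    # blend teacher probabilities with the one-hot ground truth so the
    # true class is never under-weighted in the soft label.
    probs = softmax(teacher_logits)
    one_hot = np.eye(probs.shape[-1])[true_labels]
    return (1 - alpha) * probs + alpha * one_hot

def cosine_similarity_loss(student_logits, refined_labels):
    # Loss = 1 - cosine similarity between the student's predicted
    # distribution and the refined soft label, averaged over the batch.
    p = softmax(student_logits)
    num = (p * refined_labels).sum(axis=-1)
    denom = np.linalg.norm(p, axis=-1) * np.linalg.norm(refined_labels, axis=-1)
    return float(np.mean(1.0 - num / denom))
```

Because cosine similarity is scale-invariant, such a loss depends only on the relative shape of the predicted distribution, which is one plausible reason a similarity-based objective is less sensitive to logit magnitudes than, say, plain cross-entropy.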