🤖 AI Summary
To address the scalability bottlenecks of large-scale dataset distillation (prohibitive computational cost, high memory consumption, homogenized synthetic images, and poor cross-architecture generalization), this paper proposes a curriculum-based distillation framework. It synthesizes training images progressively, ordered from simple to complex, introduces a curriculum evaluation mechanism that mitigates image homogenization at a manageable computational cost, and adds an adversarial optimization module that improves the representativeness of the synthetic images and prevents overfitting to the distillation network, yielding stronger cross-architecture generalization and robustness to noise. Extensive experiments set a new state of the art for large-scale dataset distillation, with accuracy gains of 11.1% on Tiny-ImageNet, 9.0% on ImageNet-1K, and 7.3% on ImageNet-21K over prior approaches.
📝 Abstract
Most dataset distillation methods struggle to accommodate large-scale datasets due to their substantial computational and memory requirements. In this paper, we present a curriculum-based dataset distillation framework designed to balance scalability with efficiency. The framework distills synthetic images strategically, following a curriculum that progresses from simple to complex samples. By incorporating curriculum evaluation, we address the tendency of previous methods to generate homogeneous, simplistic images, and we do so at a manageable computational cost. Furthermore, we apply adversarial optimization to the synthetic images to improve their representativeness and to guard against overfitting to the network used for distillation; this enhances the generalization of the distilled images across neural network architectures and increases their robustness to noise. Extensive experiments demonstrate that our framework sets new benchmarks in large-scale dataset distillation, achieving substantial improvements of 11.1% on Tiny-ImageNet, 9.0% on ImageNet-1K, and 7.3% on ImageNet-21K. The source code will be released to the community.
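To make the abstract's recipe concrete, below is a minimal NumPy sketch of the overall loop it describes: optimize a small synthetic set by gradient matching against real batches drawn from an expanding easy-to-hard curriculum, with an occasional adversarial nudge on the synthetic inputs. Everything here is an illustrative assumption rather than the paper's implementation: the "images" are 2-D linear-regression inputs, the network is a single frozen linear model, the difficulty score is per-sample loss under a reference model, and the adversarial step is FGSM-style; the paper's actual networks, difficulty measure, and adversarial objective are not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" dataset: 2-D linear regression (stands in for images and labels).
d, n = 2, 200
w_true = np.array([1.5, -0.7])
X_real = rng.normal(size=(n, d))
y_real = X_real @ w_true + 0.1 * rng.normal(size=n)

# Difficulty proxy (assumption): per-sample loss under a fixed reference model.
difficulty = (X_real @ np.zeros(d) - y_real) ** 2
order = np.argsort(difficulty)                      # easy -> hard

def grad_w(X, y, w):
    """Gradient of mean squared error with respect to the model weights w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def match_loss(X_s, y_s, X_b, y_b, w):
    """Squared distance between synthetic-batch and real-batch gradients."""
    return float(np.sum((grad_w(X_s, y_s, w) - grad_w(X_b, y_b, w)) ** 2))

# Synthetic set: a handful of learnable "images" with fixed targets.
m = 4
X_syn = rng.normal(size=(m, d))
y_syn = rng.normal(size=m)
w = rng.normal(size=d)                              # frozen distillation model

stages = [order[: n // 4], order[: n // 2], order]  # expanding curriculum
lr, eps_adv, eps_num = 0.05, 0.01, 1e-5
D_init = match_loss(X_syn, y_syn, X_real, y_real, w)

for s, stage in enumerate(stages):
    X_b, y_b = X_real[stage], y_real[stage]
    if s > 0:
        # Adversarial nudge (assumption: FGSM-style step on the model loss)
        # to keep synthetic images from overfitting the distillation model.
        r = X_syn @ w - y_syn
        X_syn = X_syn + eps_adv * np.sign(np.outer(r, w))
    for _ in range(300):
        base = match_loss(X_syn, y_syn, X_b, y_b, w)
        g = np.zeros_like(X_syn)                    # numerical input gradient
        for i in range(m):
            for j in range(d):
                X_p = X_syn.copy()
                X_p[i, j] += eps_num
                g[i, j] = (match_loss(X_p, y_syn, X_b, y_b, w) - base) / eps_num
        X_new = X_syn - lr * g
        if match_loss(X_new, y_syn, X_b, y_b, w) < base:
            X_syn = X_new                           # accept improving step
        else:
            lr *= 0.5                               # backtrack on overshoot

D_final = match_loss(X_syn, y_syn, X_real, y_real, w)
print(D_final < D_init)
```

The gradient-matching distance on the full real set should drop well below its initial value after the three curriculum stages. In the paper's setting the distillation network would be updated and re-sampled rather than frozen, and the input gradients would come from automatic differentiation rather than finite differences.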