CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation

📅 2025-12-03
🤖 AI Summary
Existing dataset distillation (DD) methods face two key bottlenecks: they either require expensive diffusion-model training on the target data, contradicting DD's core objective of eliminating target-data dependency, or they directly leverage off-the-shelf text-to-image diffusion models, which suffer from distribution shift and performance degradation because those models lack target-domain semantic priors. This paper proposes Core Distribution Alignment (CoDA), the first DD framework that requires *no target-data training whatsoever*. CoDA identifies the intrinsic core distribution of the target data via density estimation and aligns the generative process of a pre-trained text-to-image diffusion model, without fine-tuning, to this core distribution, thereby bridging the gap between generic generative priors and target-specific semantics. On ImageNet-1K, CoDA achieves 60.4% Top-1 accuracy using only 50 distilled images per class, surpassing state-of-the-art methods that rely on target-data training and setting a new record for this setting.

📝 Abstract
Prevailing Dataset Distillation (DD) methods leveraging generative models confront two fundamental limitations. First, despite pioneering the use of diffusion models in DD and delivering impressive performance, the vast majority of approaches paradoxically require a diffusion model pre-trained on the full target dataset, undermining the very purpose of DD and incurring prohibitive training costs. Second, although some methods turn to general text-to-image models without relying on such target-specific training, they suffer from a significant distributional mismatch, as the web-scale priors encapsulated in these foundation models fail to faithfully capture the target-specific semantics, leading to suboptimal performance. To tackle these challenges, we propose Core Distribution Alignment (CoDA), a framework that enables effective DD using only an off-the-shelf text-to-image model. Our key idea is to first identify the "intrinsic core distribution" of the target dataset using a robust density-based discovery mechanism. We then steer the generative process to align the generated samples with this core distribution. By doing so, CoDA effectively bridges the gap between general-purpose generative priors and target semantics, yielding highly representative distilled datasets. Extensive experiments suggest that, without relying on a generative model specifically trained on the target dataset, CoDA achieves performance on par with or even superior to previous methods with such reliance across all benchmarks, including ImageNet-1K and its subsets. Notably, it establishes a new state-of-the-art accuracy of 60.4% in the 50-images-per-class (IPC) setup on ImageNet-1K. Our code is available on the project webpage: https://github.com/zzzlt422/CoDA
Problem

Research questions and friction points this paper is trying to address.

Eliminates need for target-specific pre-trained diffusion models in dataset distillation
Addresses distribution mismatch between web-scale priors and target dataset semantics
Enables training-free dataset distillation using off-the-shelf text-to-image models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses off-the-shelf text-to-image model for distillation
Identifies intrinsic core distribution via density-based mechanism
Aligns generated samples with core distribution to bridge gap
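The paper does not give implementation details here, but the "density-based discovery" idea can be illustrated with a generic k-nearest-neighbor density estimate: score each sample in a feature space by how tightly packed its neighborhood is, then keep the highest-density samples as the "core" of the distribution. This is a minimal sketch of that general technique, not CoDA's actual algorithm; the function names, the choice of k-NN density, and the feature representation are all assumptions.

```python
import numpy as np

def knn_density_scores(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Score each sample by the inverse distance to its k-th nearest neighbor.

    A higher score means the sample sits in a denser region of feature
    space, i.e. closer to the core of the dataset's distribution.
    (Illustrative stand-in for a density-based discovery mechanism.)
    """
    # Pairwise squared Euclidean distances, shape (N, N).
    sq_norms = (features ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    d2 = np.maximum(d2, 0.0)        # guard against tiny negatives from rounding
    np.fill_diagonal(d2, np.inf)    # a sample is not its own neighbor
    # Distance to the k-th nearest neighbor of each sample.
    kth = np.sqrt(np.partition(d2, k - 1, axis=1)[:, k - 1])
    return 1.0 / (kth + 1e-12)

def select_core_samples(features: np.ndarray, n_core: int, k: int = 5) -> np.ndarray:
    """Return indices of the n_core highest-density samples (the 'core')."""
    scores = knn_density_scores(features, k=k)
    return np.argsort(-scores)[:n_core]

# Toy usage: a tight cluster (the core) plus scattered outliers.
rng = np.random.default_rng(0)
feats = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 2)),   # dense core region
    rng.normal(5.0, 3.0, size=(10, 2)),   # sparse outliers
])
core_idx = select_core_samples(feats, n_core=10)
```

In a CoDA-like pipeline, such core samples (or statistics of the core region) would then serve as the alignment target that steers the off-the-shelf text-to-image model's generation, rather than matching the full, noisier empirical distribution.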