Diffusion Dataset Condensation: Training Your Diffusion Model Faster with Less Data

📅 2025-07-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and large data requirements of diffusion model training, this paper proposes D2C, the first dataset condensation framework tailored for diffusion models. D2C employs a two-stage design: (1) subset selection guided by a diffusion difficulty score and interval-based sampling; and (2) synthetic sample augmentation that attaches semantic and visual representation embeddings to strengthen conditional signal fidelity. On the SiT-XL/2 architecture, D2C reaches an FID of 4.3 within 40k training steps using synthetic samples amounting to only 0.8% of the original dataset, yielding a 100× training speed-up. As the first systematic application of dataset condensation to diffusion model training, D2C significantly reduces both computational overhead and data dependency while preserving, or even improving, generation quality, establishing a new paradigm for efficient, scalable diffusion model training.

📝 Abstract
Diffusion models have achieved remarkable success in various generative tasks, but training them remains highly resource-intensive, often requiring millions of images and many days of GPU computation. From a data-centric perspective addressing this limitation, we study diffusion dataset condensation as a new and challenging problem setting. The goal is to construct a "synthetic" sub-dataset with significantly fewer samples than the original dataset, enabling high-quality diffusion model training with greatly reduced cost. To the best of our knowledge, we are the first to formally investigate dataset condensation for diffusion models, whereas prior work focused on training discriminative models. To tackle this new challenge, we propose a novel Diffusion Dataset Condensation (D2C) framework, which consists of two phases: Select and Attach. The Select phase identifies a compact and diverse subset using a diffusion difficulty score and interval sampling. The Attach phase enhances the selected subset by attaching rich semantic and visual representations to strengthen the conditional signals. Extensive experiments across various dataset sizes, model architectures, and resolutions show that our D2C framework enables significantly faster diffusion model training with dramatically less data, while preserving high visual quality. Notably, for the SiT-XL/2 architecture, D2C achieves a 100x training speed-up, reaching an FID score of 4.3 in just 40k steps using only 0.8% of the training data.
Problem

Research questions and friction points this paper is trying to address.

Reducing resource-intensive diffusion model training
Constructing compact synthetic sub-datasets for efficient training
Preserving high visual quality with far less data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selects a compact, diverse subset via a diffusion difficulty score and interval sampling
Attaches rich semantic and visual representations as conditional signals
Achieves a 100x training speed-up on SiT-XL/2
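The Select phase's interval sampling can be illustrated with a small sketch. This is a toy reconstruction, not the paper's implementation: the scores below are random stand-ins for the diffusion difficulty score, and `interval_select` is a hypothetical helper showing one way interval-based selection can keep a subset spread across the full difficulty range.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in difficulty scores for a toy "dataset" of 10,000 samples.
# In D2C these would come from the diffusion difficulty score.
scores = rng.random(10_000)

def interval_select(scores, budget):
    """Sort samples by difficulty, split the ranking into `budget`
    equal intervals, and take the midpoint sample of each interval,
    so the selection spans easy through hard samples rather than
    clustering at one extreme."""
    order = np.argsort(scores)                       # easiest -> hardest
    edges = np.linspace(0, len(order), budget + 1, dtype=int)
    picks = [order[(lo + hi) // 2] for lo, hi in zip(edges[:-1], edges[1:])]
    return np.array(picks)

subset = interval_select(scores, budget=80)          # ~0.8% of 10,000
print(len(subset))                                   # 80 selected indices
```

With a budget of 80 out of 10,000 samples, the selected indices cover the whole difficulty spectrum, mirroring the roughly 0.8% data ratio reported for SiT-XL/2.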
Rui Huang
xLeaF Lab, Hong Kong University of Science and Technology (Guangzhou)
Shitong Shao
The Hong Kong University of Science and Technology (Guangzhou)
Efficient Computer Vision, Diffusion Models, Inference-heavy Algorithms
Zikai Zhou
Hong Kong University of Science and Technology (Guangzhou)
Data-centric AI, Diffusion Models, Autoregressive Models, AIGC
Pukun Zhao
xLeaF Lab, Hong Kong University of Science and Technology (Guangzhou)
Hangyu Guo
Harbin Institute of Technology, Shenzhen
Tian Ye
xLeaF Lab, Hong Kong University of Science and Technology (Guangzhou)
Lichen Bai
HKUST (GZ)
Generative AI
Shuo Yang
Harbin Institute of Technology, Shenzhen
Zeke Xie
Assistant Professor, The Hong Kong University of Science and Technology (Guangzhou) / PI, xLeaF Lab
Generative AI, Data-centric AI, Large Models, Deep Learning Theory, Optimization