🤖 AI Summary
Diffusion models are costly to deploy because of their large parameter counts and many sampling steps, and existing knowledge distillation approaches fail to generalize to concepts unseen during training. To address this, we propose a random conditional distillation framework that operates without access to the original training images. Our method introduces a random text-condition pairing mechanism that enables image-agnostic knowledge transfer and lets the student freely explore the conditional space, removing the reliance on condition-specific training data. Key components include random text injection, multi-step noise-matching distillation, condition-aware feature alignment, and progressive denoising transfer. Experiments show that the distilled student achieves a 22% reduction in FID, requires 40% fewer sampling steps, and generates novel concepts absent from the training set, significantly improving data efficiency and cross-concept generalization.
📝 Abstract
Diffusion models generate high-quality images through progressive denoising but are computationally intensive due to large model sizes and repeated sampling. Knowledge distillation, which transfers knowledge from a complex teacher to a simpler student model, has been widely studied in recognition tasks, particularly for transferring concepts unseen during student training. However, its application to diffusion models remains underexplored, especially in enabling student models to generate concepts not covered by the training images. In this work, we propose Random Conditioning, a novel approach that pairs noised images with randomly selected text conditions to enable efficient, image-free knowledge distillation. By leveraging this technique, we show that the student can generate concepts unseen in the training images. When applied to conditional diffusion model distillation, our method allows the student to explore the condition space without generating condition-specific images, resulting in notable improvements in both generation quality and efficiency. This promotes resource-efficient deployment of generative diffusion models, broadening their accessibility for both research and real-world applications. Code, models, and datasets are available at https://dohyun-as.github.io/Random-Conditioning.
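The core idea, pairing a noised image with a randomly drawn text condition rather than its own caption during distillation, can be sketched as a toy training step. This is a minimal illustration under stated assumptions, not the paper's implementation: the "teacher" and "student" denoisers are stand-in linear maps, condition embeddings are random vectors, and the mixing probability `p_random` is a hypothetical hyperparameter.

```python
import random
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical): real models would be large text-to-image
# diffusion networks; here both denoisers are simple linear maps.
D, C = 8, 4                                # latent dim, condition-embedding dim
W_teacher = rng.normal(size=(D, D + C))    # frozen teacher weights
W_student = np.zeros((D, D + C))           # student to be trained

def denoise(W, x_noised, cond):
    """Predict a denoised latent from a noised latent and a condition."""
    return W @ np.concatenate([x_noised, cond])

# A small pool of text-condition embeddings; in the paper's setting these
# are captions that need not be paired with any real training image.
cond_pool = [rng.normal(size=C) for _ in range(16)]

def random_conditioning_loss(x_noised, true_cond, p_random=0.5):
    """With probability p_random, swap in a randomly selected condition,
    then match the student to the teacher on the same (image, cond) pair."""
    cond = random.choice(cond_pool) if random.random() < p_random else true_cond
    teacher_out = denoise(W_teacher, x_noised, cond)
    student_out = denoise(W_student, x_noised, cond)
    return float(np.mean((student_out - teacher_out) ** 2))  # distillation MSE

random.seed(0)
loss = random_conditioning_loss(rng.normal(size=D), cond_pool[0])
print(f"distillation loss: {loss:.4f}")
```

Because the condition is sampled independently of the image, the student is repeatedly exposed to regions of the condition space that have no matching training image, which is what lets it generate concepts absent from the image set.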