Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation

📅 2025-11-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models face two challenges when adapted to new domains: slow sampling and degraded generation quality. Existing two-stage approaches (domain adaptation followed by knowledge distillation, or the reverse) add design complexity and compromise either diversity or fidelity. This paper proposes Uni-DAD, a framework that unifies knowledge distillation and domain adaptation into a single-stage training process. Its core components are a dual-domain distribution-matching distillation objective, a multi-head GAN loss, and guidance from a target-domain teacher, which together preserve diverse source-domain knowledge while enabling high-fidelity, realistic generation in the target domain. Evaluated on few-shot image generation and subject-driven personalization, Uni-DAD achieves state-of-the-art quality with fewer than 4 sampling steps and improves on prior methods in both generation quality and diversity.

📝 Abstract
Diffusion models (DMs) produce high-quality images, yet their sampling remains costly when adapted to new domains. Distilled DMs are faster but typically remain confined within their teacher's domain. Thus, fast and high-quality generation for novel domains relies on two-stage training pipelines: Adapt-then-Distill or Distill-then-Adapt. However, both add design complexity and suffer from degraded quality or diversity. We introduce Uni-DAD, a single-stage pipeline that unifies distillation and adaptation of DMs. It couples two signals during training: (i) a dual-domain distribution-matching distillation objective that guides the student toward the distributions of the source teacher and a target teacher, and (ii) a multi-head generative adversarial network (GAN) loss that encourages target realism across multiple feature scales. The source domain distillation preserves diverse source knowledge, while the multi-head GAN stabilizes training and reduces overfitting, especially in few-shot regimes. The inclusion of a target teacher facilitates adaptation to more structurally distant domains. We perform evaluations on a variety of datasets for few-shot image generation (FSIG) and subject-driven personalization (SDP). Uni-DAD delivers higher quality than state-of-the-art (SoTA) adaptation methods even with less than 4 sampling steps, and outperforms two-stage training pipelines in both quality and diversity.
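The abstract describes coupling two signals: a dual-domain distribution-matching distillation (DMD) objective toward a source teacher and a target teacher, plus a multi-head GAN loss over multiple feature scales. A minimal NumPy sketch of how such a combined objective might be structured is below; the loss weights, the softplus generator-loss form, and all function names are illustrative assumptions, not the paper's implementation (which operates on diffusion-model scores and discriminator features, with stop-gradients that plain NumPy cannot express).

```python
import numpy as np

def dmd_loss(x, teacher_score, fake_score):
    """DMD-style surrogate: nudge generated samples x along the direction
    (teacher_score - fake_score). In a real implementation the target
    x - grad would be detached (stop-gradient); here it is just a sketch."""
    grad = fake_score - teacher_score   # direction away from the teacher
    target = x - grad                   # would be stop-gradient in practice
    return float(np.mean((x - target) ** 2))

def multi_head_gan_loss(head_logits):
    """Non-saturating generator loss softplus(-logit), averaged over
    discriminator heads operating at different feature scales."""
    return float(np.mean([np.mean(np.log1p(np.exp(-l))) for l in head_logits]))

def uni_dad_objective(x, src_teacher_score, tgt_teacher_score, fake_score,
                      head_logits, w_src=1.0, w_tgt=1.0, w_gan=0.5):
    """Single-stage objective: source-domain DMD (preserves diversity),
    target-domain DMD (drives adaptation), multi-head GAN (target realism).
    The weights are hypothetical placeholders."""
    return (w_src * dmd_loss(x, src_teacher_score, fake_score)
            + w_tgt * dmd_loss(x, tgt_teacher_score, fake_score)
            + w_gan * multi_head_gan_loss(head_logits))
```

The point of the sketch is the single-stage coupling: one backward pass sees both teachers' distillation signals and the adversarial signal at once, rather than distilling and adapting in separate training stages.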
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost of diffusion models for new domains
Overcoming quality degradation in few-step few-shot generation
Unifying distillation and adaptation in single-stage training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified single-stage pipeline combining distillation and adaptation
Dual-domain distillation matching source and target teacher distributions
Multi-head GAN loss encourages target realism across multiple feature scales