Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning

📅 2024-10-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
Diffusion models suffer from gradient interference among denoising tasks across timesteps due to parameter sharing, which limits generative performance. To address this, we propose DeMe, a framework that combines timestep-decoupled fine-tuning with reversible parameter fusion. First, dedicated submodels are learned for key timesteps with isolated gradient updates; then a knowledge-distillation-guided reversible fusion mechanism integrates them into a single model with no added inference overhead. This mitigates inter-task interference while preserving computational efficiency. Evaluated on six benchmarks, including COCO30K and ImageNet1K, DeMe consistently improves FID and LPIPS on both Stable Diffusion and DDPM backbones.

📝 Abstract
Diffusion models are trained by learning a sequence of models that reverse each step of noise corruption. Typically, the model parameters are fully shared across multiple timesteps to enhance training efficiency. However, since the denoising tasks differ at each timestep, the gradients computed at different timesteps may conflict, potentially degrading the overall performance of image generation. To solve this issue, this work proposes a Decouple-then-Merge (DeMe) framework, which begins with a pretrained model and finetunes separate models tailored to specific timesteps. We introduce several improved techniques during the finetuning stage to promote effective knowledge sharing while minimizing training interference across timesteps. Finally, after finetuning, these separate models can be merged into a single model in the parameter space, ensuring efficient and practical inference. Experimental results show significant generation quality improvements on six benchmarks, including Stable Diffusion on COCO30K, ImageNet1K, and PartiPrompts, and DDPM on LSUN Church, LSUN Bedroom, and CIFAR10. Code is available at https://github.com/MqLeet/DeMe.
Problem

Research questions and friction points this paper is trying to address.

Gradients computed at different timesteps can conflict because parameters are fully shared across denoising tasks.
Such conflicts may degrade overall image generation quality.
Timestep-specific finetuning must still yield a single model for efficient inference.
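
One common way to make "gradient conflict" concrete is to check whether two gradient vectors point in opposing directions, i.e. whether their cosine similarity is negative. The sketch below illustrates that idea on toy vectors. The function names and numbers are hypothetical, and this is not necessarily the diagnostic used in the paper.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine similarity between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # treat a zero gradient as neither aligned nor conflicting
    return dot / (norm1 * norm2)


def gradients_conflict(g1, g2):
    """Assumed criterion: two gradients conflict when their cosine is negative."""
    return cosine_similarity(g1, g2) < 0.0


# Toy gradients standing in for two different timesteps (made-up values).
grad_early = [1.0, -2.0, 0.5]
grad_late = [-1.0, 1.5, 0.0]
print(gradients_conflict(grad_early, grad_late))
```

With shared parameters, a step along one of these gradients partly undoes progress on the other task, which is the interference the paper aims to reduce.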
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouple-then-Merge framework for diffusion models
Finetune separate models for specific timesteps
Merge models post-finetuning for efficient inference
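
The decouple-then-merge flow listed above can be sketched in a few lines. The summary only states that separate models are tailored to specific timesteps and later merged in parameter space, so two details here are assumptions: the even split of timesteps into contiguous ranges, and a merge that adds weighted parameter deltas back onto the pretrained weights. All names are hypothetical, and parameters are plain Python lists rather than tensors.

```python
def timestep_ranges(num_timesteps, num_ranges):
    """Split [0, num_timesteps) into contiguous, near-equal half-open ranges."""
    base, extra = divmod(num_timesteps, num_ranges)
    ranges, start = [], 0
    for i in range(num_ranges):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def merge_in_parameter_space(pretrained, finetuned_models, weights):
    """Assumed merge rule: pretrained + sum_i w_i * (finetuned_i - pretrained)."""
    merged = {}
    for name, base in pretrained.items():
        values = list(base)  # copy so the pretrained weights stay untouched
        for model, w in zip(finetuned_models, weights):
            for i, (ft, b) in enumerate(zip(model[name], base)):
                values[i] += w * (ft - b)
        merged[name] = values
    return merged


# Toy example: one parameter tensor and two timestep-specific finetuned copies.
pretrained = {"w": [1.0, 2.0]}
finetuned = [{"w": [2.0, 2.0]}, {"w": [1.0, 4.0]}]
merged = merge_in_parameter_space(pretrained, finetuned, [0.5, 0.5])
print(timestep_ranges(1000, 4), merged)
```

Because the merged result is a single set of parameters with the same shape as the pretrained model, inference cost is unchanged regardless of how many timestep-specific models were finetuned.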