A Diffusion Model Translator for Efficient Image-to-Image Translation

📅 2024-07-30
🏛️ IEEE Transactions on Pattern Analysis and Machine Intelligence
📈 Citations: 5
Influential: 0
🤖 AI Summary
Existing diffusion-based image-to-image (I2I) translation methods inject the source image at every denoising step, which makes inference slow. This work proposes the lightweight Diffusion Model Translator (DMT) and offers a theoretical justification that, under the DDPM framework, a single inter-domain distribution transfer at an intermediate timestep is both feasible and sufficient for I2I translation. Guided by this insight, DMT adds a compact translation module that performs the domain transfer only at that timestep. Because translation quality depends strongly on which timestep is chosen, the authors also propose a practical strategy that automatically selects an appropriate timestep for a given task. Experiments on image stylization, image colorization, segmentation-to-image, and sketch-to-image tasks show that DMT surpasses existing methods in both image quality and efficiency, with inference averaging a 3.2× speedup.

📝 Abstract
Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into each denoising step for an iterative refinement, thus resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that in employing the pioneering DDPM work for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate step. We further observe that the translation performance highly depends on the chosen timestep for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency. Code is available at https://github.com/THU-LYJ-Lab/dmt.
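The pipeline described in the abstract can be sketched roughly as follows: noise the source image to an intermediate timestep, apply the translator once, then run the ordinary reverse process in the target domain. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear noise schedule, the function names, and the `translator` and `denoise_step` callables are hypothetical placeholders standing in for trained networks.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear DDPM schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def translate_with_dmt(x_src, t, alpha_bar, translator, denoise_step, rng):
    """Noise the source to step t, translate once, then denoise down to step 0.

    `translator(x_t, t)` and `denoise_step(x, step)` are hypothetical callables:
    the first maps a noisy source-domain sample to the target domain, the second
    performs one reverse step of a DDPM trained on the target domain.
    """
    x_t_src = forward_diffuse(x_src, t, alpha_bar, rng)
    x = translator(x_t_src, t)            # the only step that uses source information
    for step in range(t, -1, -1):         # plain target-domain reverse process
        x = denoise_step(x, step)
    return x
```

In this sketch the source image influences the result only through the single `translator` call, in contrast to methods that condition every denoising step on it. How the timestep `t` is chosen is left open here; the abstract states that the paper proposes an automatic selection strategy for it.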
Problem

Research questions and friction points this paper is trying to address.

Image-to-Image Translation
Denoising
Speed Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion Model Translator
Denoising Diffusion Probabilistic Model
Image-to-Image Translation