🤖 AI Summary
Diffusion bridge models for image-to-image translation are limited in practice because sampling is slow and requires a large number of function evaluations (NFEs), which drives up computational cost. This work proposes DBMSolver, a training-free, efficient sampler that introduces exponential integrators into diffusion bridge sampling for the first time. By exploiting the semi-linear structure of the underlying SDE/ODE, DBMSolver supports both first- and second-order numerical integration without altering the pre-trained model. The method improves both sampling efficiency and generation quality, achieving a 53% reduction in FID on the DIODE dataset at 20 NFEs relative to a second-order baseline. It reports state-of-the-art trade-offs between efficiency and fidelity across diverse tasks and resolutions, making practical deployment more feasible.
📝 Abstract
Diffusion-based image-to-image (I2I) translation excels at high-fidelity generation but suffers from slow sampling in state-of-the-art Diffusion Bridge Models (DBMs), where the number of function evaluations (NFEs) often runs into the dozens. We introduce DBMSolver, a training-free sampler that exploits the semi-linear structure of the DBMs' underlying SDE and ODE via exponential integrators, yielding highly efficient 1st- and 2nd-order solutions. This reduces NFEs by up to 5× while boosting quality (e.g., FID drops 53% on DIODE at 20 NFEs vs. a 2nd-order baseline). Experiments on inpainting, stylization, and semantics-to-image tasks at resolutions up to 256×256 show that DBMSolver sets new state-of-the-art efficiency-quality tradeoffs, enabling real-world applicability. Our code is publicly available at https://github.com/snumprlab/dbmsolver.
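To illustrate what "exploiting the semi-linear structure via exponential integrators" means in general, here is a minimal sketch on a toy scalar ODE of the form dx/dt = lam*x + N(x, t). It is not the DBMSolver update rule and does not use the authors' code: the function names (`exponential_euler`, `explicit_euler`, `forcing`), the coefficient `LAM`, and the constant forcing term are all assumptions chosen for illustration. The idea shown is only the generic one: the linear part is integrated exactly with an exponential, and just the remaining term is approximated, which is why large steps (few function evaluations) can stay accurate.

```python
import math


def exponential_euler(x0, lam, nonlinear, t0, t1, steps):
    """First-order exponential integrator for dx/dt = lam*x + nonlinear(x, t).

    Hypothetical toy helper, not the DBMSolver sampler.
    """
    h = (t1 - t0) / steps
    decay = math.exp(lam * h)        # exact propagator of the linear part
    phi = math.expm1(lam * h) / lam  # integral of exp(lam*(h - s)) for s in [0, h]
    x, t = x0, t0
    for _ in range(steps):
        # The non-linear term is frozen at the start of the step (first order).
        x = decay * x + phi * nonlinear(x, t)
        t += h
    return x


def explicit_euler(x0, lam, nonlinear, t0, t1, steps):
    """Plain explicit Euler on the same ODE, for comparison."""
    h = (t1 - t0) / steps
    x, t = x0, t0
    for _ in range(steps):
        x = x + h * (lam * x + nonlinear(x, t))
        t += h
    return x


LAM = -50.0  # assumed stiff linear coefficient (illustrative only)


def forcing(x, t):
    """Assumed constant stand-in for the non-linear (network) term."""
    return 10.0


def exact(x0, t):
    """Closed-form solution of dx/dt = LAM*x + 10 with x(0) = x0."""
    return (x0 + 10.0 / LAM) * math.exp(LAM * t) - 10.0 / LAM


# Ten steps of size 0.1: well outside explicit Euler's stability region.
x_exp = exponential_euler(1.0, LAM, forcing, 0.0, 1.0, 10)
x_eul = explicit_euler(1.0, LAM, forcing, 0.0, 1.0, 10)
x_ref = exact(1.0, 1.0)
print(f"exact={x_ref:.6f}  exponential={x_exp:.6f}  euler={x_eul:.1f}")
```

In this toy setting the exponential scheme matches the closed-form solution with ten steps, while explicit Euler diverges at the same step size. A diffusion bridge sampler would replace the constant forcing with a pre-trained network evaluation and use schedule-dependent coefficients, which this sketch does not attempt to reproduce.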