🤖 AI Summary
Diffusion models suffer from slow inference, which hinders interactive applications. This paper introduces "Diffusion Preview," a paradigm in which a small number of sampling steps produce a faithful preview image for user assessment, and full refinement runs only after the user confirms. The core contribution is ConsistencySolver, presented as the first lightweight, trainable, high-order ODE solver based on linear multistep methods, with coefficients optimized via reinforcement learning to maximize both preview quality and consistency between the preview and the final output. Crucially, the method requires no model retraining or knowledge distillation and is compatible with any pre-trained diffusion model. Experiments show that, compared to Multistep DPM-Solver, the approach matches FID with 47% fewer sampling steps, significantly outperforms distillation-based baselines, and cuts total user-side interaction time by nearly 50% without compromising final output quality.
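The preview-and-refine workflow described above can be sketched as follows. This is a minimal illustration, not the paper's API: `sample`, `approve`, and the step counts are hypothetical placeholders standing in for a real diffusion sampling loop and a user decision.

```python
def sample(prompt, num_steps, seed=0):
    """Stand-in for a diffusion sampling loop run for num_steps solver steps.

    A real implementation would iterate a denoiser (e.g. with a fast ODE
    solver) num_steps times from noise drawn with the given seed.
    """
    return f"image({prompt}, steps={num_steps}, seed={seed})"

def preview_and_refine(prompt, approve, preview_steps=4, full_steps=50, seed=0):
    """Generate a cheap low-step preview; run full sampling only if approved.

    Reusing the same seed (i.e. the same initial noise) in both passes is
    what makes preview/final consistency a meaningful objective.
    """
    preview = sample(prompt, num_steps=preview_steps, seed=seed)
    if approve(preview):
        return sample(prompt, num_steps=full_steps, seed=seed)
    return None  # rejected: the user would retry with a new prompt or seed
```

The point of the paradigm is that the expensive `full_steps` pass is skipped entirely whenever the user rejects the preview, so wasted compute and waiting time are bounded by the cheap pass.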
📝 Abstract
The slow inference process of image diffusion models significantly degrades interactive user experiences. To address this, we introduce Diffusion Preview, a novel paradigm that employs rapid, low-step sampling to generate preliminary outputs for user evaluation, deferring full-step refinement until the preview is deemed satisfactory. Existing acceleration methods, including training-free solvers and post-training distillation, struggle to deliver high-quality previews or to ensure consistency between previews and final outputs. We propose ConsistencySolver, a lightweight, trainable high-order solver derived from general linear multistep methods and optimized via reinforcement learning to enhance both preview quality and consistency. Experimental results demonstrate that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, making it well suited to efficient preview-and-refine workflows. Notably, it achieves FID scores on par with Multistep DPM-Solver using 47% fewer steps, while outperforming distillation baselines. Furthermore, user studies indicate our approach reduces overall user interaction time by nearly 50% while maintaining generation quality. Code is available at https://github.com/G-U-N/consolver.
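As a rough illustration of the linear-multistep machinery that a solver like this builds on, the sketch below shows a generic k-step explicit update whose coefficients could in principle be learned instead of fixed. The classical Adams-Bashforth-2 values used here are a textbook default for the demo ODE dx/dt = -x; nothing in this block comes from the paper's actual implementation, which operates on the diffusion probability-flow ODE.

```python
import math

def multistep_update(x, f_hist, h, coeffs):
    """One explicit linear-multistep step:

        x_{n+1} = x_n + h * sum_k coeffs[k] * f_{n-k}

    f_hist holds past derivative evaluations, newest first. In a trainable
    solver, coeffs would be lightweight learned parameters (here they are
    the classical Adams-Bashforth values).
    """
    return x + h * sum(c * f for c, f in zip(coeffs, f_hist))

# Demo: integrate dx/dt = -x from x(0)=1 to t=1 (exact answer: exp(-1)).
f = lambda x: -x
h = 0.01
x = 1.0
f_hist = [f(x)]
x = x + h * f_hist[0]          # bootstrap the first step with Euler
for _ in range(99):
    f_hist = [f(x)] + f_hist[:1]   # keep a 2-deep history, newest first
    x = multistep_update(x, f_hist, h, coeffs=[1.5, -0.5])  # AB2
print(x)  # close to exp(-1) ≈ 0.3679
```

Because each step reuses cached derivative evaluations instead of calling the model again, a multistep scheme raises the order of accuracy at essentially no extra network cost, which is why it is an attractive basis for few-step diffusion sampling.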