🤖 AI Summary
Diffusion models for image compression suffer from high inference latency and substantial computational overhead due to their multi-step denoising process, hindering practical deployment. This work proposes the first single-step diffusion-based image compression method, which requires only one denoising operation during decoding. To enhance perceptual quality, the approach introduces a discriminator trained on compact feature representations for perception-oriented adversarial learning. The proposed method achieves rate-distortion performance comparable to existing diffusion-based compression techniques while accelerating inference by a factor of 46, thereby striking a significantly improved balance between reconstruction fidelity and computational efficiency.
📝 Abstract
Diffusion-based image compression methods have achieved notable progress, delivering high perceptual quality at low bitrates. However, their practical deployment is hindered by significant inference latency and heavy computational overhead, primarily due to the large number of denoising steps required during decoding. To address this problem, we propose a diffusion-based image compression method that requires only a single denoising step, substantially improving inference speed. To enhance the perceptual quality of reconstructed images, we introduce a discriminator that operates on compact feature representations instead of raw pixels, exploiting the fact that such features better capture high-level texture and structural details. Experimental results show that our method delivers comparable compression performance while running 46$\times$ faster than recent diffusion-based approaches. The source code and models are available at https://github.com/cheesejiang/OSDiff.
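To make the single-step idea concrete, here is a minimal, self-contained sketch of the generic one-step $x_0$-prediction that single-step diffusion decoders build on: given a noisy latent and one network prediction of the noise, the clean signal is recovered in closed form without iterative sampling. The function names and the toy data are illustrative assumptions, not the paper's actual implementation.

```python
import math

def one_step_denoise(x_t, eps_pred, alpha_bar_t):
    """Recover x0 from x_t in a single step (illustrative sketch).

    Assumes the standard diffusion forward process
        x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    so one call to a noise-prediction network suffices to invert it:
        x0 = (x_t - sqrt(1 - alpha_bar_t) * eps_pred) / sqrt(alpha_bar_t).
    """
    s = math.sqrt(alpha_bar_t)
    n = math.sqrt(1.0 - alpha_bar_t)
    return [(xt - n * e) / s for xt, e in zip(x_t, eps_pred)]

# Toy check: if eps_pred equals the true noise, x0 is recovered exactly,
# which is why no multi-step denoising loop is needed at decode time.
x0 = [0.2, -0.5, 0.9]          # hypothetical clean latent values
eps = [0.1, 0.0, -0.3]         # hypothetical noise realization
alpha_bar = 0.25               # hypothetical cumulative noise schedule value
x_t = [math.sqrt(alpha_bar) * a + math.sqrt(1 - alpha_bar) * b
       for a, b in zip(x0, eps)]
recon = one_step_denoise(x_t, eps, alpha_bar)
```

In practice the network's noise estimate is imperfect, which is why the paper pairs the single-step decoder with a feature-space discriminator to restore perceptual detail; the sketch only shows the closed-form inversion itself.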