DiffO: Single-step Diffusion for Image Compression at Ultra-Low Bitrates

📅 2025-06-19
🤖 AI Summary
Existing traditional and generative image codecs suffer from severe quality collapse and excessive decoding latency at ultra-low bitrates (<0.1 bpp). Method: This paper proposes the first one-step diffusion-based image compression framework, integrating vector quantization (VQ), single-step denoising diffusion, and joint rate-distortion optimization. Key innovations include (1) VQ-based residual modeling in latent space to improve the fidelity of geometric structure, and (2) a bitrate-adaptive noise modulation mechanism that adjusts high-frequency detail reconstruction to the target rate. Contribution/Results: Extensive experiments show that the method significantly outperforms state-of-the-art codecs (e.g., HiFiC, DCVC) at ultra-low bitrates, with substantial gains in PSNR, MS-SSIM, and perceptual quality metrics (e.g., LPIPS, FID). Decoding is also approximately 50× faster than with iterative diffusion-based methods, enabling, for the first time, both high-fidelity reconstruction and real-time decoding under ultra-low-bitrate constraints.

📝 Abstract
Although image compression is fundamental to visual data processing and has inspired numerous standard and learned codecs, these methods still suffer severe quality degradation at extremely low bits per pixel. While recent diffusion-based models provide enhanced generative performance at low bitrates, they still yield limited perceptual quality and prohibitive decoding latency due to multiple denoising steps. In this paper, we propose the first single-step diffusion model for image compression (DiffO) that delivers high perceptual quality and fast decoding at ultra-low bitrates. DiffO achieves these goals by coupling two key innovations: (i) VQ Residual training, which factorizes a structural base code and a learned residual in latent space, capturing both global geometry and high-frequency details; and (ii) rate-adaptive noise modulation, which tunes denoising strength on the fly to match the desired bitrate. Extensive experiments show that DiffO surpasses state-of-the-art compression performance while improving decoding speed by about 50x compared to prior diffusion-based methods, greatly improving the practicality of generative codecs. The code will be available at https://github.com/Freemasti/DiffO.
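
To make the "structural base code plus residual" idea in the abstract concrete, here is a minimal sketch of a vector-quantized split of a single latent vector. It is an illustration under assumptions, not DiffO's implementation: the names `nearest_code`, `split_base_residual`, and `reconstruct`, the toy codebook, and the 2-D latent are all hypothetical, and the residual here is computed by plain subtraction, whereas the abstract describes a learned residual.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(z: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to z (squared L2 distance)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(z, codebook[i]))


def split_base_residual(z: Vector, codebook: Sequence[Vector]) -> Tuple[int, List[float]]:
    """Split z into a discrete base code (an index) and a continuous residual."""
    idx = nearest_code(z, codebook)
    # Assumption: the residual is whatever the base code fails to capture.
    residual = [x - c for x, c in zip(z, codebook[idx])]
    return idx, residual


def reconstruct(idx: int, residual: Vector, codebook: Sequence[Vector]) -> List[float]:
    """Recombine the base code and the residual into a latent vector."""
    return [c + r for c, r in zip(codebook[idx], residual)]


# Toy example: three 2-D codewords and one latent vector (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
z = [0.9, 1.2]
idx, residual = split_base_residual(z, codebook)
z_rec = reconstruct(idx, residual, codebook)
```

In this toy case the base index alone carries the coarse position of the latent, while the residual holds the small correction; a real codec would also have to decide how, or whether, to spend bits on the residual.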

Problem

Research questions and friction points this paper is trying to address.

Improves image compression quality at ultra-low bitrates
Reduces decoding latency in diffusion-based compression models
Enhances perceptual quality with single-step diffusion process
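
For a sense of scale, the snippet below works out what an ultra-low bitrate means in bytes. It is plain arithmetic rather than anything taken from the paper: the helper names and the 768×512 image size are hypothetical, and the 0.1 bpp figure is the threshold quoted in the summary above.

```python
def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate of a compressed file in bits per pixel (bpp)."""
    return 8 * num_bytes / (width * height)


def byte_budget(bpp: float, width: int, height: int) -> int:
    """Largest whole number of bytes that keeps the file at or below `bpp`."""
    return int(bpp * width * height // 8)


# Hypothetical 768x512 image coded at the 0.1 bpp threshold.
budget = byte_budget(0.1, 768, 512)          # 4915 bytes
actual = bits_per_pixel(budget, 768, 512)    # just under 0.1 bpp

# Uncompressed 24-bit RGB is 24 bpp, so 0.1 bpp is a 240x reduction.
reduction = 24 / 0.1
```

Under these assumptions the whole image has to fit in under 5 KB, which is why codecs in this regime must synthesize plausible detail instead of transmitting it.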

Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-step diffusion model for compression
VQ Residual training to capture both global structure and high-frequency detail
Rate-adaptive noise modulation to match the target bitrate
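
The first and third items can be illustrated together with a toy sketch: the standard DDPM-style one-step estimate of the clean sample, with the noise level chosen from the target bitrate. Everything specific here is an assumption made for illustration. The linear schedule in `alpha_bar_for_bitrate`, its endpoint values, and the function names are hypothetical and are not DiffO's modulation rule, and the noise prediction `eps_hat` stands in for the output of a trained network.

```python
import math
from typing import List, Sequence


def alpha_bar_for_bitrate(bpp: float, bpp_lo: float = 0.01, bpp_hi: float = 0.1,
                          a_lo: float = 0.5, a_hi: float = 0.95) -> float:
    """Hypothetical schedule: lower bitrate -> smaller alpha_bar -> more noise,
    so the decoder is asked to synthesize more of the detail."""
    t = (bpp - bpp_lo) / (bpp_hi - bpp_lo)
    t = min(1.0, max(0.0, t))  # clamp to the supported bitrate range
    return a_lo + t * (a_hi - a_lo)


def one_step_denoise(x_t: Sequence[float], eps_hat: Sequence[float],
                     alpha_bar: float) -> List[float]:
    """Invert x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps in one step."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [(x - noise * e) / signal for x, e in zip(x_t, eps_hat)]


# Round-trip check with an oracle noise prediction (toy values).
x0 = [0.2, -0.5, 1.0]
eps = [0.3, -0.1, 0.7]
a = alpha_bar_for_bitrate(0.05)
x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
x0_hat = one_step_denoise(x_t, eps, a)
```

With a perfect noise prediction the single step recovers `x0` exactly; in practice the quality of a one-step decoder depends on how well the network predicts the noise at the chosen level, which is the part the paper's training procedure addresses.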