OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates

📅 2025-05-22
🤖 AI Summary
Existing diffusion-based image compression methods suffer from two key limitations: high computational cost due to iterative sampling and the need for separate model training per bit-rate, resulting in prohibitive training and storage overhead. This paper proposes OSCAR, the first multi-bit-rate diffusion image compression framework enabling single-step decoding. Its core innovation lies in modeling compressed latent variables as intermediate noise states along a diffusion trajectory and explicitly mapping bit-rate to pseudo-time steps, thereby enabling arbitrary-bit-rate reconstruction via a single model and a single forward pass. OSCAR integrates latent-space compression, structure-aware single-step denoising, conditional time-step embedding, and diffusion prior modeling. Experiments demonstrate that OSCAR consistently outperforms state-of-the-art diffusion-based compression methods in PSNR, LPIPS, and perceptual quality, while accelerating inference by over an order of magnitude and substantially reducing both computational and storage costs.

📝 Abstract
Pretrained latent diffusion models have shown strong potential for lossy image compression, owing to their powerful generative priors. Most existing diffusion-based methods reconstruct images by iteratively denoising from random noise, guided by compressed latent representations. While these approaches have achieved high reconstruction quality, their multi-step sampling process incurs substantial computational overhead. Moreover, they typically require training separate models for different compression bit-rates, leading to significant training and storage costs. To address these challenges, we propose a one-step diffusion codec across multiple bit-rates, termed OSCAR. Specifically, our method views compressed latents as noisy variants of the original latents, where the level of distortion depends on the bit-rate. This perspective allows them to be modeled as intermediate states along a diffusion trajectory. By establishing a mapping from the compression bit-rate to a pseudo diffusion timestep, we condition a single generative model to support reconstructions at multiple bit-rates. Meanwhile, we argue that the compressed latents retain rich structural information, thereby making one-step denoising feasible. Thus, OSCAR replaces iterative sampling with a single denoising pass, significantly improving inference efficiency. Extensive experiments demonstrate that OSCAR achieves superior performance in both quantitative and visual quality metrics. The code and models will be released at https://github.com/jp-guo/OSCAR.
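The idea of treating a compressed latent as an intermediate diffusion state can be illustrated with a small sketch. The abstract does not say how the bit-rate is mapped to a pseudo timestep, so everything below is an assumption made for illustration: a linear beta schedule with 1000 steps, a hypothetical `pseudo_timestep` helper that matches a measured distortion variance (one value per bit-rate) to the schedule's noise-to-signal ratio, and a hypothetical `one_step_denoise` helper that applies the standard DDPM clean-sample estimate. It is not the authors' implementation.

```python
import math

def alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def pseudo_timestep(distortion_var, abars):
    """Hypothetical mapping: pick the timestep whose noise-to-signal ratio
    (1 - abar_t) / abar_t is closest to the distortion variance measured
    for a given bit-rate (assumes roughly unit-variance latents)."""
    return min(range(len(abars)),
               key=lambda t: abs((1.0 - abars[t]) / abars[t] - distortion_var))

def one_step_denoise(z_t, eps_pred, abar_t):
    """Standard DDPM estimate of the clean latent from one noise prediction:
    z_0 = (z_t - sqrt(1 - abar_t) * eps) / sqrt(abar_t)."""
    return [(z - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
            for z, e in zip(z_t, eps_pred)]

abars = alpha_bars()
t_low_rate = pseudo_timestep(0.5, abars)    # heavier distortion (low bit-rate)
t_high_rate = pseudo_timestep(0.05, abars)  # lighter distortion (high bit-rate)
```

Under these assumptions a lower bit-rate, which distorts the latent more, lands on a later pseudo timestep, and a single call to `one_step_denoise` with a network's noise prediction stands in for the whole iterative sampling loop.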
Problem

Research questions and friction points this paper is trying to address.

Reduces computational overhead in diffusion-based image compression
Eliminates need for separate models per bit-rate
Enables high-quality one-step denoising for efficient reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step denoising for efficient image reconstruction
Single generative model supports multiple bit-rates
Compressed latents modeled as diffusion trajectory states
Jinpei Guo
Carnegie Mellon University
Deep Learning, Combinatorial Optimization, Generative AI
Yifei Ji
Shanghai Jiao Tong University
Zheng Chen
Shanghai Jiao Tong University
Kai Liu
Shanghai Jiao Tong University
Min Liu
Carnegie Mellon University
Wang Rao
Carnegie Mellon University
Machine Learning
Wenbo Li
The Chinese University of Hong Kong
Computer Vision, Deep Learning
Yong Guo
South China University of Technology
Yulun Zhang
Shanghai Jiao Tong University