🤖 AI Summary
Most recent diffusion-based image compression methods require training and are tailored to a specific bit rate, which limits their flexibility. This paper proposes Posterior Sampling-based Compression (PSC), a zero-shot framework that achieves rate-adaptive compression using only a pre-trained diffusion model, with no additional training and no transmission of the transform itself. Its core ideas are: (i) adaptive construction of an image-dependent transform matrix, built progressively and replicated identically at the encoder and decoder; and (ii) a zero-shot posterior sampler conditioned on the quantized measurements collected so far, so that each new chunk of rows is chosen to reduce the remaining uncertainty about the image. The method relies only on basic quantization and entropy coding. Experiments show that PSC is comparable to established training-based methods in rate, distortion, and perceptual quality. It also allows any desired rate or distortion to be chosen at inference time.
📝 Abstract
Diffusion models have transformed the landscape of image generation and now show remarkable potential for image compression. Most recent diffusion-based compression methods require training and are tailored to a specific bit rate. In this work, we propose Posterior Sampling-based Compression (PSC), a zero-shot compression method that leverages a pre-trained diffusion model as its sole neural network component, thus enabling the use of diverse, publicly available models without additional training. Our approach is inspired by transform coding methods, which encode the image in some pre-chosen transform domain. However, PSC constructs a transform that is adaptive to the image. This is done by employing a zero-shot diffusion-based posterior sampler to progressively construct the rows of the transform matrix. Each new chunk of rows is chosen to reduce the uncertainty about the image given the quantized measurements collected thus far. Importantly, the same adaptive scheme can be replicated at the decoder, thus avoiding the need to encode the transform itself. We demonstrate that even with basic quantization and entropy coding, PSC's performance is comparable to established training-based methods in terms of rate, distortion, and perceptual quality, while providing greater flexibility: any desired rate or distortion can be chosen at inference time.
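The progressive scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It makes several assumptions: a closed-form Gaussian posterior stands in for the diffusion-based posterior sampler, new rows are taken as the top principal directions of the posterior samples, quantization is uniform with a fixed step, and entropy coding is omitted. All names (`posterior_samples`, `next_rows`, `run_codec`) and parameters are hypothetical. The sketch only aims to show why the decoder can rebuild the same transform from the quantized symbols and a shared random seed.

```python
import numpy as np

def posterior_samples(A, y, prior_cov, noise_var, n, rng):
    """Toy stand-in for a diffusion posterior sampler (assumption):
    samples x ~ N(0, prior_cov) conditioned on y = A x + Gaussian noise."""
    d = prior_cov.shape[0]
    if A.shape[0] == 0:
        mean, cov = np.zeros(d), prior_cov
    else:
        S = A @ prior_cov @ A.T + noise_var * np.eye(A.shape[0])
        K = np.linalg.solve(S, A @ prior_cov).T   # gain: prior_cov A^T S^-1
        mean = K @ y
        cov = prior_cov - K @ A @ prior_cov
    # Sample through an eigendecomposition, clipping tiny negative eigenvalues.
    w, V = np.linalg.eigh((cov + cov.T) / 2)
    w = np.clip(w, 0.0, None)
    return mean + rng.standard_normal((n, d)) @ (V * np.sqrt(w)).T

def next_rows(samples, k):
    """Pick the k directions of largest posterior variance (PCA of samples)."""
    centered = samples - samples.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def run_codec(prior_cov, step, n_rounds, k, x=None, symbols=None, seed=0):
    """Acts as the encoder when x is given and as the decoder when only
    symbols are given. Both sides share the seed, so they draw the same
    posterior samples and therefore build the same transform rows."""
    rng = np.random.default_rng(seed)
    d = prior_cov.shape[0]
    noise_var = step ** 2 / 12          # variance of uniform quantization error
    A, y, out = np.zeros((0, d)), np.zeros(0), []
    for r in range(n_rounds):
        samples = posterior_samples(A, y, prior_cov, noise_var, 32, rng)
        rows = next_rows(samples, k)
        if x is not None:               # encoder: measure and quantize
            q = np.round(rows @ x / step).astype(int)
        else:                           # decoder: read the transmitted symbols
            q = symbols[r]
        out.append(q)
        A = np.vstack([A, rows])
        y = np.concatenate([y, q * step])
    recon = posterior_samples(A, y, prior_cov, noise_var, 1, rng)[0]
    return out, A, recon

# Usage: encode a random toy signal, then decode from the symbols alone.
idx = np.arange(16)
cov = 0.9 ** np.abs(np.subtract.outer(idx, idx))
x = np.linalg.cholesky(cov) @ np.random.default_rng(1).standard_normal(16)
symbols, A_enc, recon_enc = run_codec(cov, 0.25, n_rounds=4, k=2, x=x)
_, A_dec, recon_dec = run_codec(cov, 0.25, n_rounds=4, k=2, symbols=symbols)
```

In this sketch only the integer `symbols` would be transmitted. The rate can be changed at inference time by varying `n_rounds` or `step`, which mirrors the flexibility claimed in the abstract, although the actual rate control in PSC may differ.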