🤖 AI Summary
This work addresses the fidelity degradation in diffusion-based image compression caused by the random noise introduced during reconstruction. To mitigate this, the authors propose Noise-Constrained Diffusion (NC-Diffusion), a framework that, for the first time, uses the quantization noise of learned image compression as the noise source of the forward diffusion process, thereby establishing a constrained diffusion path from the original image to its initial compressed reconstruction. The method further incorporates adaptive frequency-domain filtering to enhance high-frequency details and a zero-shot sample-guided mechanism to further improve reconstruction fidelity. Experiments show that NC-Diffusion achieves state-of-the-art performance across multiple benchmark datasets, improving both the fidelity and the visual quality of reconstructed images.
📝 Abstract
With the great success of diffusion models in image generation, diffusion-based image compression is attracting increasing interest. However, due to the random noise introduced during diffusion training, these methods usually produce reconstructions that deviate from the original images, leading to suboptimal compression results. To address this problem, in this paper we propose a Noise-Constrained Diffusion (NC-Diffusion) framework for high-fidelity image compression. Unlike existing diffusion-based compression methods, which add random Gaussian noise and denoise it back into the image space, NC-Diffusion formulates the quantization noise already introduced in learned image compression as the noise of the forward diffusion process. A noise-constrained diffusion process is then constructed from the ground-truth image to the initial compression result generated under quantization noise. NC-Diffusion thus overcomes the noise mismatch between compression and diffusion, significantly improving inference efficiency. In addition, an adaptive frequency-domain filtering module is developed to strengthen the skip connections of the U-Net based diffusion architecture, enhancing high-frequency details. Moreover, a zero-shot sample-guided enhancement method is designed to further improve reconstruction fidelity. Experiments on multiple benchmark datasets demonstrate that our method achieves the best performance among existing methods.
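The core contrast — a standard Gaussian forward step versus using compression quantization noise as the forward noise — can be illustrated with a toy NumPy sketch. All function and variable names below are illustrative, not the paper's implementation; the grounded facts are only that learned image compression is commonly trained with additive uniform noise in [-0.5, 0.5] as a differentiable proxy for rounding, and that the test-time quantization residual `round(y) - y` is bounded by the same interval.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "latent" standing in for an encoder output (illustrative only).
y = rng.normal(size=(4, 8, 8)).astype(np.float32)

def gaussian_forward(y, sigma=1.0):
    """Standard diffusion forward step: add unbounded Gaussian noise."""
    return y + sigma * rng.normal(size=y.shape)

def quantization_noise_forward(y):
    """Forward step driven by quantization noise instead.

    Learned compression codecs typically train with additive uniform
    noise in [-0.5, 0.5] as a differentiable proxy for rounding, so the
    perturbation is bounded -- unlike the Gaussian case above.
    """
    return y + rng.uniform(-0.5, 0.5, size=y.shape)

# At test time the actual quantization "noise" is the deterministic
# rounding residual, which lies in the same bounded interval.
residual = np.round(y) - y
assert np.all(np.abs(residual) <= 0.5)
```

The bounded residual is what makes a constrained diffusion path plausible: the endpoint of the forward process stays within a known distance of the initial compression result, rather than wandering under unbounded Gaussian noise.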