CADC: Content Adaptive Diffusion-Based Generative Image Compression

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing diffusion-based generative image compression methods, which suffer from inadequate content adaptivity: uniform quantization ignores spatial complexity, mismatched latent-space dimensions create an information bottleneck, and text guidance imposes either high bitrate overhead or weak semantic generalization. These shortcomings lead to suboptimal reconstruction quality and semantic fidelity. To overcome them, the authors propose a content-adaptive diffusion compression framework that jointly optimizes the encoded representation and the decoder's generative prior to achieve highly realistic reconstruction at ultra-low bitrates. Key innovations include uncertainty-guided adaptive quantization, an auxiliary decoder-driven information concentration mechanism, and an adaptive text-conditioning strategy that incurs zero bitrate overhead. Experimental results demonstrate that the approach significantly improves detail preservation, structural alignment, and semantic consistency in images reconstructed at extremely low bitrates.

📝 Abstract
Diffusion-based generative image compression has demonstrated remarkable potential for achieving realistic reconstruction at ultra-low bitrates. The key to unlocking this potential lies in making the entire compression process content-adaptive, ensuring that the encoder's representation and the decoder's generative prior are dynamically aligned with the semantic and structural characteristics of the input image. However, existing methods suffer from three critical limitations that prevent effective content adaptation. First, isotropic quantization applies a uniform quantization step, failing to adapt to the spatially varying complexity of image content and creating a misalignment with the diffusion model's noise-dependent prior. Second, the information concentration bottleneck (arising from the dimensional mismatch between the high-dimensional noisy latent and the diffusion decoder's fixed input) prevents the model from adaptively preserving essential semantic information in the primary channels. Third, existing textual conditioning strategies either incur significant bitrate overhead for transmitting text or rely on generic, content-agnostic prompts, thereby failing to provide adaptive semantic guidance efficiently. To overcome these limitations, we propose a content-adaptive diffusion-based image codec with three technical innovations: 1) an Uncertainty-Guided Adaptive Quantization method that learns spatial uncertainty maps to adaptively align quantization distortion with content characteristics; 2) an Auxiliary Decoder-Guided Information Concentration method that uses a lightweight auxiliary decoder to enforce content-aware information preservation in the primary latent channels; and 3) a Bitrate-Free Adaptive Textual Conditioning method that derives content-aware textual descriptions from the auxiliary reconstructed image, enabling semantic guidance without bitrate cost.
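To make the first idea concrete, the following is a minimal NumPy sketch of spatially adaptive quantization in the spirit of the paper's Uncertainty-Guided Adaptive Quantization. The paper does not publish this code; the `adaptive_quantize` function, the step-size mapping, and the toy uncertainty map are all illustrative assumptions (in the actual method the uncertainty map is learned).

```python
import numpy as np

def adaptive_quantize(latent, uncertainty, min_step=0.25, max_step=2.0):
    """Quantize a latent with a spatially varying step size.

    Illustrative sketch (not the paper's implementation): regions with
    high uncertainty (complex content) get a finer quantization step so
    more detail survives; low-uncertainty regions get a coarser step.
    `uncertainty` is assumed to lie in [0, 1] with the same shape as
    `latent`.
    """
    # Linearly map uncertainty in [0, 1] to a step in [max_step, min_step]:
    # higher uncertainty -> smaller step -> less quantization distortion.
    step = max_step - (max_step - min_step) * uncertainty
    quantized = np.round(latent / step) * step
    return quantized, step

rng = np.random.default_rng(0)
latent = rng.normal(size=(4, 8, 8))

# Hypothetical stand-in for a learned uncertainty map: normalized |latent|.
u = np.abs(latent)
u = u / u.max()

q, step = adaptive_quantize(latent, u)
```

The per-element step map would itself need to be reproducible at the decoder (e.g. predicted from decoded side information), which is why the paper pairs it with the diffusion model's noise-dependent prior rather than transmitting the map directly.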
Problem

Research questions and friction points this paper is trying to address.

content-adaptive compression
diffusion-based image compression
adaptive quantization
information concentration bottleneck
textual conditioning
Innovation

Methods, ideas, or system contributions that make the work stand out.

content-adaptive compression
diffusion-based generative model
adaptive quantization
information concentration
textual conditioning