CoD-Lite: Real-Time Diffusion-Based Generative Image Compression

📅 2026-04-14

🤖 AI Summary
Existing diffusion-based codecs struggle to balance generation quality against inference cost when the model must be small enough for real-time image compression. This work proposes a lightweight, one-step diffusion codec built for compression, replacing the usual Transformer backbone with a compact convolutional design that is strengthened by compression-oriented pre-training, knowledge distillation, and adversarial learning. The codec runs in real time at 1080p, with 60 FPS encoding and 42 FPS decoding, which is reported as a first for diffusion-based compression. It reduces bitrate by 85% at an FID comparable to MS-ILLM, and the analysis finds that compression-oriented pre-training consistently outperforms generation-oriented pre-training at small model scales. Together, these results point to a practical path toward efficient diffusion-based image compression.

📝 Abstract
Recent advanced diffusion methods typically derive strong generative priors by scaling diffusion transformers. However, scaling fails to generalize when adapted for real-time compression scenarios that demand lightweight models. In this paper, we explore the design of real-time and lightweight diffusion codecs by addressing two pivotal questions. First, does diffusion pre-training benefit lightweight diffusion codecs? Through systematic analysis, we find that generation-oriented pre-training is less effective at small model scales, whereas compression-oriented pre-training yields consistently better performance. Second, are transformers essential? We find that while global attention is crucial for standard generation, lightweight convolutions suffice for compression-oriented diffusion when paired with distillation. Guided by these findings, we establish a one-step lightweight convolutional diffusion codec that achieves real-time 60 FPS encoding and 42 FPS decoding at 1080p. Further enhanced by distillation and adversarial learning, the proposed codec reduces bitrate by 85% at a comparable FID to MS-ILLM, bridging the gap between generative compression and practical real-time deployment. Code is released at https://github.com/microsoft/GenCodec/CoD_Lite
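
The headline numbers in the abstract can be put in more familiar terms with a little arithmetic. The sketch below is only illustrative: the helper names `frame_budget_ms` and `relative_bitrate` are hypothetical and do not come from the paper or its repository, and the per-frame time assumes frames are processed one at a time, which the abstract does not state.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput.

    Assumption: frames are processed sequentially, so throughput and
    per-frame time are reciprocal. Pipelined or batched systems differ.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def relative_bitrate(reduction_percent: float) -> float:
    """Fraction of the baseline bitrate left after a percentage reduction."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be in [0, 100]")
    return 1.0 - reduction_percent / 100.0


# Figures quoted in the abstract (1080p, MS-ILLM as the bitrate baseline).
ENCODE_FPS = 60
DECODE_FPS = 42
BITRATE_REDUCTION_PERCENT = 85

print(f"encode: {frame_budget_ms(ENCODE_FPS):.1f} ms/frame")
print(f"decode: {frame_budget_ms(DECODE_FPS):.1f} ms/frame")
print(f"bitrate vs. baseline: {relative_bitrate(BITRATE_REDUCTION_PERCENT):.2f}x")
```

Under these assumptions, 60 FPS corresponds to roughly 16.7 ms per encoded frame and 42 FPS to roughly 23.8 ms per decoded frame, while an 85% bitrate reduction means spending about 0.15 of the baseline's bits at a comparable FID.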
Problem

Research questions and friction points this paper is trying to address.

generative image compression
real-time compression
lightweight diffusion models
diffusion codecs
model scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

lightweight diffusion codec
compression-oriented pre-training
convolution-based diffusion
real-time generative compression
distillation