Generative Preprocessing for Image Compression with Pre-trained Diffusion Models

📅 2025-12-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Conventional rate-distortion (R-D) optimization in image compression prioritizes pixel-level fidelity at the expense of perceptual quality. To address this, the paper applies a large-scale pre-trained diffusion model to compression preprocessing, which the authors describe as a first, and proposes a rate-perception (R-P) optimization paradigm. The method has two stages: (1) Stable Diffusion 2.1 is distilled into a compact, one-step image-to-image model using Consistent Score Identity Distillation (CiD); (2) the distilled model's attention modules are fine-tuned in a parameter-efficient way, guided by a rate-perception loss and a differentiable codec surrogate. The resulting generative preprocessor requires no modification to standard codecs and uses the model's diffusion priors to enhance texture and mitigate artifacts. On the Kodak dataset, the method achieves up to a 30.13% BD-rate reduction measured with DISTS, along with improved subjective visual quality.
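
For readers unfamiliar with the BD-rate figure quoted above, the following is a minimal sketch of the standard Bjontegaard delta-rate calculation, assuming the usual cubic fit of log-rate against a quality score. The function name `bd_rate` and all sample numbers are hypothetical and are not taken from the paper; the paper's exact evaluation protocol is not given here.

```python
import numpy as np

def bd_rate(anchor_rate, anchor_quality, test_rate, test_quality):
    """Bjontegaard delta rate in percent (negative means the test curve saves bits).

    Works for any monotone quality score. For DISTS (lower is better) the
    result is unchanged, because only the average gap in log-rate over the
    shared quality interval matters.
    """
    log_ra = np.log(np.asarray(anchor_rate, dtype=float))
    log_rt = np.log(np.asarray(test_rate, dtype=float))
    qa = np.asarray(anchor_quality, dtype=float)
    qt = np.asarray(test_quality, dtype=float)

    # Fit log-rate as a cubic polynomial of the quality score for each curve.
    pa = np.polyfit(qa, log_ra, 3)
    pt = np.polyfit(qt, log_rt, 3)

    # Integrate both fits over the quality interval covered by both curves.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    int_a = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
    int_t = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)

    # Average log-rate difference, converted to a percentage.
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)

# Hypothetical rate (bits per pixel) and DISTS points, for illustration only.
anchor_bpp = [0.2, 0.4, 0.8, 1.6]
dists = [0.20, 0.15, 0.10, 0.06]
test_bpp = [0.7 * r for r in anchor_bpp]  # same quality at 70% of the bits
delta = bd_rate(anchor_bpp, dists, test_bpp, dists)
```

In this made-up example the test curve reaches every quality level with 70% of the anchor's bits, so `delta` comes out at about -30, which is what a "30% BD-rate reduction" means.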

📝 Abstract

Preprocessing is a well-established technique for optimizing compression, yet existing methods are predominantly Rate-Distortion (R-D) optimized and constrained by pixel-level fidelity. This work pioneers a shift towards Rate-Perception (R-P) optimization by, for the first time, adapting a large-scale pre-trained diffusion model for compression preprocessing. We propose a two-stage framework: first, we distill the multi-step Stable Diffusion 2.1 into a compact, one-step image-to-image model using Consistent Score Identity Distillation (CiD). Second, we perform a parameter-efficient fine-tuning of the distilled model's attention modules, guided by a Rate-Perception loss and a differentiable codec surrogate. Our method seamlessly integrates with standard codecs without any modification and leverages the model's powerful generative priors to enhance texture and mitigate artifacts. Experiments show substantial R-P gains, achieving up to a 30.13% BD-rate reduction in DISTS on the Kodak dataset and delivering superior subjective visual quality.
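
One plausible way to write the second-stage objective described above is sketched below. The notation is an assumption made for illustration and is not taken from the paper: g_θ is the distilled one-step preprocessor with trainable attention parameters θ, S is the differentiable codec surrogate returning a reconstruction and a rate estimate, d_perc is a perceptual distance such as DISTS, and λ is a trade-off weight. Whether the perceptual term compares the reconstruction to the original image, as written here, is also an assumption.

```latex
\min_{\theta}\; \mathbb{E}_{x}\Big[\, R \;+\; \lambda\, d_{\mathrm{perc}}\big(x,\ \hat{x}\big) \Big],
\qquad (\hat{x},\, R) = S\big(g_{\theta}(x)\big)
```

Because S stands in for the real codec only during training, the standard codec itself is left unmodified at inference time, which matches the integration claim in the abstract.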

Problem

Research questions and friction points this paper is trying to address.

Optimizes image compression using Rate-Perception instead of Rate-Distortion.

Adapts pre-trained diffusion models for compression preprocessing without codec changes.

Enhances texture and reduces artifacts via generative priors and fine-tuning.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a pre-trained diffusion model for compression preprocessing.

Distills Stable Diffusion 2.1 into a compact one-step model.

Fine-tunes attention modules with a Rate-Perception loss.
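
Restricting fine-tuning to attention modules amounts to choosing which parameters receive updates. The snippet below is a framework-free sketch of that selection step, assuming parameters are addressed by dotted names and that attention layers can be recognized by a substring such as "attn". The helper `select_trainable` and the parameter names are hypothetical and do not come from the paper or from any specific library.

```python
def select_trainable(param_names, keywords=("attn",)):
    """Return the parameter names that contain any of the given keywords.

    Everything not returned would stay frozen during fine-tuning.
    """
    return [name for name in param_names if any(k in name for k in keywords)]

# Hypothetical parameter names, for illustration only.
names = [
    "down.0.resnet.conv1.weight",
    "down.0.attn1.to_q.weight",
    "mid.attn2.to_k.weight",
    "up.1.resnet.norm.weight",
]
trainable = select_trainable(names)
frozen = [n for n in names if n not in trainable]
```

In a real training setup the same filter would typically decide which tensors have gradients enabled, so only a small fraction of the model's weights is updated.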

🔎 Similar Papers

PSC: Posterior Sampling-Based Compression