Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption

📅 2024-06-02
🏛️ arXiv.org
📈 Citations: 4
Influential: 2

🤖 AI Summary
Generative image compression suffers from inflexible bitrate control: existing methods struggle to combine high reconstruction fidelity with generalization across a wide, fine-grained range of bitrates. This paper proposes Control-GIC, which it describes as the first framework for fine-grained, controllable bitrate adaptation. The method builds on the VQGAN architecture and has three key components: (1) an information-density-driven dynamic granularity adaptation mechanism that links the complexity of each local image patch to the granularity at which it is represented; (2) a probabilistic conditional hierarchical decoder that reconstructs multi-granularity features and aggregates them conditionally; and (3) variable-length VQ coding, in which the length of the VQ-index sequence correlates directly with the bitrate. The reported experiments show highly flexible, controllable bitrate adaptation and superior performance over recent state-of-the-art methods.

📝 Abstract
Although recent generative image compression methods have shown impressive potential in optimizing the rate-distortion-perception trade-off, they still face the critical challenge of adapting the rate flexibly to diverse compression needs and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, termed Control-GIC, the first capable of fine-grained bitrate adaptation across a broad spectrum while ensuring high-fidelity, general-purpose compression. Control-GIC is grounded in a VQGAN framework that encodes an image as a sequence of variable-length codes (i.e., VQ-indices), which can be losslessly compressed and whose length correlates directly and positively with the bitrate. Drawing inspiration from classical coding principles, we correlate the information density of local image patches with their granular representations. We can therefore flexibly determine a suitable allocation of granularity across the patches, dynamically adjusting the number of VQ-indices to reach the desired compression rate. We further develop a probabilistic conditional decoder that retrieves the previously encoded multi-granularity representations according to the transmitted codes and then reconstructs hierarchical granular features, formulated as conditional probabilities, enabling more informative aggregation and more realistic reconstructions. Our experiments show that Control-GIC allows highly flexible and controllable bitrate adaptation and outperforms recent state-of-the-art methods.
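
The abstract ties the information density of each local patch to the granularity of its representation. The sketch below is one way to picture that idea and is not the authors' implementation. It assumes that information density is measured as the Shannon entropy of a patch's intensity histogram, that there are three granularity levels, and that a 16x16 patch costs 1, 4, or 16 VQ-indices depending on its level. The function names, the ratios, and those per-level counts are all hypothetical.

```python
import numpy as np

def patch_entropy(gray, patch=16, bins=256):
    """Shannon entropy (in bits) of the intensity histogram of each patch.

    `gray` is a 2-D uint8 array; rows/columns that do not fill a whole
    patch are ignored.
    """
    h, w = gray.shape
    rows, cols = h // patch, w // patch
    ent = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            block = gray[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            hist = np.bincount(block.ravel(), minlength=bins).astype(float)
            p = hist[hist > 0] / block.size
            ent[r, c] = -(p * np.log2(p)).sum()
    return ent

def allocate_granularity(ent, fine_ratio, medium_ratio):
    """Give the highest-entropy patches the finest level.

    Level 2 (fine) goes to the top `fine_ratio` share of patches, level 1
    (medium) to the next `medium_ratio` share, level 0 (coarse) to the rest.
    """
    flat = ent.ravel()
    order = np.argsort(-flat, kind="stable")  # descending entropy
    n_fine = int(round(fine_ratio * flat.size))
    n_medium = int(round(medium_ratio * flat.size))
    levels = np.zeros(flat.size, dtype=int)
    levels[order[:n_fine]] = 2
    levels[order[n_fine:n_fine + n_medium]] = 1
    return levels.reshape(ent.shape)

def index_count(levels, codes_per_level=(1, 4, 16)):
    """Total number of VQ-indices implied by a level map (assumed costs)."""
    return int(sum(codes_per_level[level] for level in levels.ravel()))
```

Under these assumptions, raising `fine_ratio` or `medium_ratio` increases the number of transmitted indices and therefore the rate, which is the control knob the abstract describes in general terms.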
Problem

Research questions and friction points this paper is trying to address.

Flexible rate adaptation for diverse image compression needs
Fine-grained bitrate control ensuring high-fidelity compression
Dynamic granularity adjustment for optimal rate-distortion trade-off
Innovation

Methods, ideas, or system contributions that make the work stand out.

VQGAN framework with variable-length codes
Dynamic granularity adaptation for bitrate control
Probabilistic conditional decoder for realistic reconstruction
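
To make the link between variable-length codes and bitrate concrete, here is a small arithmetic sketch. It assumes every VQ-index is stored with a fixed-length code of log2(K) bits for a codebook of K entries, which gives an upper bound on the rate; the lossless coding of the indices mentioned in the abstract would lower the actual figure. The function name, the codebook size, and the image size are hypothetical.

```python
import math

def bpp_upper_bound(num_indices, codebook_size, height, width):
    """Bits per pixel if each VQ-index is stored with a fixed-length code."""
    bits_per_index = math.log2(codebook_size)
    return num_indices * bits_per_index / (height * width)

# Hypothetical setting: a 256x256 image and a 1024-entry codebook.
for n in (256, 1024, 4096):
    print(n, bpp_upper_bound(n, 1024, 256, 256))
```

The bound grows linearly with the number of indices, so in this simplified model choosing how many patches receive fine granularity amounts to choosing the rate.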

Authors

Anqi Li, Institute of Information Science, Beijing Jiaotong University
Yuxi Liu, University of California, Berkeley (general relativity, quantum mechanics, neural network)
H. Bai, Institute of Information Science, Beijing Jiaotong University
Feng Li, School of Computer Science and Engineering, Hefei University of Technology
Runmin Cong, School of Control Science and Engineering, Shandong University
Meng Wang, School of Computer Science and Engineering, Hefei University of Technology
Yao Zhao, Institute of Information Science, Beijing Jiaotong University