Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression

📅 2026-04-12

🤖 AI Summary

Existing vector quantization (VQ)-based image compression methods decouple representation learning from entropy modeling and therefore lack end-to-end joint rate-distortion (RD) optimization, which makes it difficult to preserve structural fidelity while compressing efficiently at extremely low bitrates. This work proposes RDVQ, a framework that incorporates an entropy constraint into vector quantization through a differentiable relaxation of the codebook distribution, enabling end-to-end RD optimization. The entropy loss directly shapes the latent prior, and an autoregressive entropy model provides accurate entropy estimation and test-time rate control. RDVQ unifies image tokenization and compression in a single framework and, with significantly fewer parameters, attains competitive or superior perceptual quality. Compared with RDEIC on DIV2K-val, it reduces bitrate by up to 75.71% when measured on DISTS and by up to 37.63% when measured on LPIPS.

📝 Abstract

The rapid growth of visual data under stringent storage and bandwidth constraints makes extremely low-bitrate image compression increasingly important. While Vector Quantization (VQ) offers strong structural fidelity, existing methods lack a principled mechanism for joint rate-distortion (RD) optimization due to the disconnect between representation learning and entropy modeling. We propose RDVQ, a unified framework that enables end-to-end RD optimization for VQ-based compression via a differentiable relaxation of the codebook distribution, allowing the entropy loss to directly shape the latent prior. We further develop an autoregressive entropy model that supports accurate entropy modeling and test-time rate control. Extensive experiments demonstrate that RDVQ achieves strong performance at extremely low bitrates with a lightweight architecture, attaining competitive or superior perceptual quality with significantly fewer parameters. Compared with RDEIC, RDVQ reduces bitrate by up to 75.71% on DISTS and 37.63% on LPIPS on DIV2K-val. Beyond empirical gains, RDVQ introduces an entropy-constrained formulation of VQ, highlighting the potential for a more unified view of image tokenization and compression. The code will be available at https://github.com/CVL-UESTC/RDVQ.
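
The abstract does not spell out how the codebook distribution is relaxed, so the following is only a minimal sketch of one common way to make a VQ rate term smooth, not the authors' implementation. It assumes a softmax over negative squared distances as the relaxation, a fixed categorical prior standing in for the learned autoregressive entropy model, and a Lagrangian loss of the form distortion + lam * rate. The names `soft_assignments`, `rate_bits`, `rd_loss`, `temperature`, and `lam` are hypothetical. NumPy is used to show the forward computation only; in training, the same expressions would be written in an autodiff framework so that the rate term produces gradients for the encoder and codebook.

```python
import numpy as np

def soft_assignments(z, codebook, temperature=1.0):
    """Relax hard nearest-neighbour assignment into a softmax over codewords.

    z: (N, D) latent vectors, codebook: (K, D) codewords.
    Returns an (N, K) matrix whose rows are probability distributions.
    """
    # Squared Euclidean distance between every latent and every codeword.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def rate_bits(probs, prior):
    """Expected code length in bits per latent: cross-entropy of the soft
    assignments against a prior over codebook indices (assumed fixed here)."""
    return float(-(probs * np.log2(prior + 1e-12)).sum(axis=1).mean())

def rd_loss(z, codebook, prior, lam=0.1, temperature=1.0):
    """Assumed Lagrangian objective: distortion + lam * rate."""
    probs = soft_assignments(z, codebook, temperature)
    # Soft reconstruction: probability-weighted average of codewords.
    z_soft = probs @ codebook
    distortion = float(((z - z_soft) ** 2).mean())
    rate = rate_bits(probs, prior)
    return distortion + lam * rate, distortion, rate
```

Under these assumptions, lowering `temperature` pushes the soft assignments toward hard nearest-neighbour quantization, while `lam` trades reconstruction error against estimated bits per latent.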

Problem

Research questions and friction points this paper is trying to address.

Vector Quantization

Rate-Distortion Optimization

Generative Image Compression

Entropy Modeling

Low-Bitrate Compression

Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable Vector Quantization

Rate-Distortion Optimization

Generative Image Compression