Switchable Token-Specific Codebook Quantization For Face Image Compression

📅 2025-10-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing global shared codebook approaches neglect intra-face semantic correlations and token-level semantic disparities, leading to suboptimal reconstruction quality and face recognition performance at ultra-low bitrates (e.g., 0.05 bpp). To address this, we propose a switchable token-specific codebook quantization framework: first, codebooks are learned independently per semantic category; then, each visual token is dynamically assigned its most suitable dedicated codebook, enabling fine-grained, low-distortion quantization. Our method is the first to jointly couple token-level codebook selection with category-aware grouping—reducing individual codebook size while enhancing representational diversity and quantization fidelity. Experiments demonstrate that reconstructed face images achieve a mean recognition accuracy of 93.51% at 0.05 bpp, significantly outperforming global codebook baselines. This work establishes a novel paradigm for codebook-driven face compression models.

📝 Abstract
With the ever-increasing volume of visual data, its efficient and lossless transmission, along with subsequent interpretation and understanding, has become a critical bottleneck in modern information systems. Emerging codebook-based solutions utilize a globally shared codebook to quantize and dequantize each token, controlling the bpp by adjusting the number of tokens or the codebook size. However, for facial images, which are rich in attributes, such global codebook strategies overlook both the category-specific correlations within images and the semantic differences among tokens, resulting in suboptimal performance, especially at low bpp. Motivated by these observations, we propose a Switchable Token-Specific Codebook Quantization for face image compression, which learns distinct codebook groups for different image categories and assigns an independent codebook to each token. By recording the codebook group to which each token belongs with a small number of bits, our method can reduce the loss incurred when decreasing the size of each codebook group. This enables a larger total number of codebooks under a lower overall bpp, thereby enhancing the expressive capability and improving reconstruction performance. Owing to its generalizable design, our method can be integrated into any existing codebook-based representation learning approach and has demonstrated its effectiveness on face recognition datasets, achieving an average accuracy of 93.51% for reconstructed images at 0.05 bpp.
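The paper does not ship code; as a rough illustration of the token-to-codebook assignment the abstract describes, here is a minimal NumPy sketch. All sizes (group count G, codewords per group K, token dimension D) and the nearest-codeword selection rule are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy sizes: G codebook groups (one per semantic category),
# each holding K codewords of dimension D.
G, K, D = 4, 64, 16
codebooks = rng.standard_normal((G, K, D))  # stand-ins for learned per-category codebooks

def quantize_token(z):
    """Pick the group and codeword with the lowest squared distortion for token z."""
    dists = ((codebooks - z) ** 2).sum(axis=-1)   # distances to every codeword, shape (G, K)
    g, k = np.unravel_index(dists.argmin(), dists.shape)
    return int(g), int(k), codebooks[g, k]        # group id, codeword id, reconstruction

# Quantize a toy sequence of visual tokens; the transmitted stream is the
# (group index, codeword index) pair recorded for each token.
tokens = rng.standard_normal((8, D))
indices = [quantize_token(z)[:2] for z in tokens]
```

Dynamically switching groups per token is what lets each group stay small (low index cost) while the pool of codewords across all groups stays large.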
Problem

Research questions and friction points this paper is trying to address.

Improves face image compression at low bit rates
Addresses limitations of global codebook quantization methods
Enhances reconstruction quality through token-specific codebook groups
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns distinct codebook groups for image categories
Assigns independent codebook to each token
Records token-codebook mapping with minimal bits
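The last bullet implies a simple bitrate budget: each token pays a few bits for its group index on top of the codeword index. With purely illustrative numbers (none of these sizes come from the paper), the bookkeeping works out as:

```python
import math

# Illustrative, assumed numbers: a 256x256 face image encoded as 256 tokens,
# with G = 4 codebook groups of K = 16 codewords each.
H, W = 256, 256
num_tokens = 256
G, K = 4, 16

# Each token costs log2(G) bits to record its group plus log2(K) bits
# for the codeword index within that group.
bits_per_token = math.log2(G) + math.log2(K)  # 2 + 4 = 6 bits
bpp = num_tokens * bits_per_token / (H * W)   # 1536 / 65536 ≈ 0.023 bpp
```

Note that one global codebook of G x K = 64 entries would cost the same 6 bits per token; the claimed gain is that each of the four smaller groups can specialize to its semantic category, the "enhanced expressive capability" the abstract refers to.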
👥 Authors
Yongbo Wang
East China Normal University, Shanghai, China
Haonan Wang
Tencent Youtu Lab, Shanghai, China
Guodong Mu
Tencent Youtu Lab, Shanghai, China
Ruixin Zhang
Tencent
Jiaqi Chen
East China Normal University, Shanghai, China
Jingyun Zhang
PhD student, Beihang University
Jun Wang
Tencent WeChat Pay Lab, Shenzhen, China
Yuan Xie
East China Normal University, Shanghai, China
Zhizhong Zhang
Associate Researcher, East China Normal University
Shouhong Ding
Tencent Youtu Lab, Shanghai, China