Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization

📅 2026-02-01
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This work addresses the limitations of traditional vector quantization (VQ), which relies on non-differentiable hard nearest-neighbor assignments and heuristic straight-through estimators, leading to unstable gradients and poor codebook utilization. To overcome these issues, the authors propose GRIT-VQ, a fully differentiable VQ framework that preserves hard assignment during the forward pass while introducing a geometry-aware, radius-guided latent update mechanism and a shared-parameter ensemble codebook transformation. These innovations jointly promote collaborative codebook evolution, stabilize gradient flow, and effectively prevent codebook collapse. Experimental results demonstrate that GRIT-VQ significantly improves performance across image reconstruction, generative modeling, and recommendation tasks, while substantially enhancing codebook utilization.
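The summary above contrasts the heuristic straight-through estimator with the paper's radius-guided latent update. The following minimal PyTorch sketch illustrates one plausible reading of that contrast; the function names, the fixed-radius step rule, and `eps` are illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch: standard straight-through VQ vs. a hypothetical
# radius-guided update in the spirit of GRIT-VQ. All names and the
# exact step rule are assumptions, not taken from the paper.
import torch

def ste_quantize(z: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Standard VQ: hard nearest-neighbor assignment with straight-through
    gradients. The effective update is tied to the quantization gap (q - z),
    which the paper identifies as a source of unstable gradients; the
    codebook itself gets no gradient through this path."""
    dists = torch.cdist(z, codebook)           # (B, K) pairwise distances
    q = codebook[dists.argmin(dim=-1)]         # hard nearest-neighbor codes
    return z + (q - z).detach()                # forward: q, backward: identity

def radius_quantize(z: torch.Tensor, codebook: torch.Tensor,
                    radius: float = 0.1, eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical radius-guided variant: keep the hard assignment in the
    forward pass, but backpropagate through a controllable step of length
    `radius` along the quantization direction, decoupling the update size
    from the quantization gap."""
    dists = torch.cdist(z, codebook)
    q = codebook[dists.argmin(dim=-1)]
    direction = (q - z) / ((q - z).norm(dim=-1, keepdim=True) + eps)
    step = radius * direction                  # geometry-aware, fixed-radius step
    return z + step + (q - z - step).detach()  # forward value is still exactly q
```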

📝 Abstract
Vector quantization (VQ) underpins modern generative and representation models by turning continuous latents into discrete tokens. Yet hard nearest-neighbor assignments are non-differentiable and are typically optimized with heuristic straight-through estimators, which couple the update step size to the quantization gap and train each code in isolation, leading to unstable gradients and severe codebook under-utilization at scale. In this paper, we introduce GRIT-VQ (Generalized Radius and Integrated Transform-Vector Quantization), a unified surrogate framework that keeps hard assignments in the forward pass while making VQ fully differentiable. GRIT-VQ replaces the straight-through estimator with a radius-based update that moves latents along the quantization direction with a controllable, geometry-aware step, and applies a data-agnostic integrated transform to the codebook so that all codes are updated through shared parameters instead of independently. Our theoretical analysis clarifies the fundamental optimization dynamics introduced by GRIT-VQ, establishing conditions for stable gradient flow, coordinated codebook evolution, and reliable avoidance of collapse across a broad family of quantizers. Across image reconstruction, image generation, and recommendation tokenization benchmarks, GRIT-VQ consistently improves reconstruction error, generative quality, and recommendation accuracy while substantially increasing codebook utilization compared to existing VQ variants.
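The abstract's "data-agnostic integrated transform" suggests that the effective codebook is produced by a shared parametric map over all codes, so a gradient from any single assignment updates every code through the shared weights. Below is a hedged sketch under that assumption; `IntegratedCodebook`, the single linear layer, and the initialization scale are hypothetical stand-ins for whatever transform the paper actually uses.

```python
# Hedged sketch of a shared-parameter "integrated transform" over the
# codebook: the effective codes are a learned map of base codes, so all
# codes co-evolve through the shared parameters instead of independently.
import torch
import torch.nn as nn

class IntegratedCodebook(nn.Module):
    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        self.base = nn.Parameter(torch.randn(num_codes, dim) * 0.02)
        # Shared transform: receives gradient whenever *any* code is used,
        # coupling the evolution of the whole codebook.
        self.transform = nn.Linear(dim, dim)

    def forward(self) -> torch.Tensor:
        # Effective codebook = shared transform applied to the base codes.
        return self.transform(self.base)       # (num_codes, dim)

# Usage: quantize against the transformed codebook; a gradient to one
# selected code also flows into self.transform, touching all codes.
codebook_module = IntegratedCodebook(num_codes=1024, dim=64)
codes = codebook_module()                      # differentiable codebook
```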
Problem

Research questions and friction points this paper is trying to address.

Vector Quantization
Non-differentiability
Codebook Under-utilization
Straight-through Estimator
Gradient Instability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable Vector Quantization
GRIT-VQ
Codebook Utilization
Radius-based Update
Integrated Codebook Transform
🔎 Similar Papers
2024-10-08 · IEEE International Conference on Acoustics, Speech, and Signal Processing · Citations: 0