ArcVQ-VAE: A Spherical Vector Quantization Framework with ArcCosine Additive Margin

📅 2026-05-13

🤖 AI Summary
Conventional VQ-VAE models tokenize images with a finite codebook, and this limited capacity restricts how rich and discriminative the learned discrete representations can be. To address this, the authors propose a Spherical Angular-Margin Prior (SAMP) with two components: Ball-Bounded Norm Regularization, which keeps all codebook vectors inside a time-dependent Euclidean ball, and an ArcCosine Additive Margin Loss, which encourages greater angular separation among latent vectors. The resulting framework, ArcVQ-VAE, aims for more uniformly dispersed latent representations and better codebook utilization. On standard image reconstruction and generation tasks, the paper reports performance competitive with baseline models in reconstruction accuracy, representation diversity, and sample quality.
📝 Abstract
Vector Quantized Variational Autoencoder (VQ-VAE) has become a fundamental framework for learning discrete representations in image modeling. However, VQ-VAE models must tokenize entire images using a finite set of codebook vectors, and this capacity limitation restricts their ability to capture rich and diverse representations. In this paper, we propose ArcCosine Additive Margin VQ-VAE (ArcVQ-VAE), a novel vector quantization framework that introduces a spherical angular-margin prior (SAMP) for the codebook of a conventional VQ-VAE. The proposed SAMP consists of Ball-Bounded Norm Regularization, which constrains all codebook vectors within a time-dependent Euclidean ball, and ArcCosine Additive Margin Loss, which encourages greater angular separability among latent vectors. This formulation promotes more discriminative and uniformly dispersed latent representations within the constrained space, thereby improving effective latent-space coverage and leading to improved codebook utilization. Experimental results on standard image reconstruction and generation tasks show that ArcVQ-VAE achieves competitive performance against baseline models in terms of reconstruction accuracy, representation diversity, and sample quality. The code is available at: https://github.com/goals4292/ArcVQ-VAE
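The abstract names the two SAMP components but does not give their formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a linear radius schedule for the "time-dependent" ball (`radius_at`), a simple rescaling projection (`clip_to_ball`), and an ArcFace-style additive angular margin applied to the cosine between a latent vector and its assigned codebook vector (`arc_margin_loss`). All function names, the schedule, and the `margin` and `scale` values are hypothetical; the linked repository is the reference for the actual method.

```python
import math


def radius_at(step, total_steps, r_start=4.0, r_end=1.0):
    """Hypothetical linear schedule for the ball radius (assumption, not from the paper)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return r_start + (r_end - r_start) * frac


def clip_to_ball(vec, radius):
    """Rescale vec so that its Euclidean norm is at most radius."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= radius:
        return list(vec)
    return [x * radius / norm for x in vec]


def cosine(a, b):
    """Cosine similarity, guarded against zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / max(na * nb, 1e-12)


def arc_margin_loss(latent, codebook, target, margin=0.2, scale=10.0):
    """ArcFace-style cross-entropy over codebook entries (assumed form).

    The angle between the latent and its assigned code (index `target`)
    is enlarged by `margin` before the softmax, so the latent has to be
    closer in angle to its code to reach the same loss.
    """
    logits = []
    for k, code in enumerate(codebook):
        cos = max(-1.0, min(1.0, cosine(latent, code)))
        theta = math.acos(cos)
        if k == target:
            theta = min(theta + margin, math.pi)  # additive angular margin
        logits.append(scale * math.cos(theta))
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


# Usage: shrink a codebook into the current ball, then score one latent.
codebook = [[3.0, 4.0], [0.0, 0.5]]
radius = radius_at(step=100, total_steps=100)
codebook = [clip_to_ball(c, radius) for c in codebook]
loss = arc_margin_loss([1.0, 0.2], codebook, target=0)
```

In this sketch, setting `margin=0.0` recovers a plain cosine-softmax loss, which makes the effect of the margin easy to isolate when experimenting.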
Problem

Research questions and friction points this paper is trying to address.

VQ-VAE
discrete representations
codebook capacity
representation diversity
image modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vector Quantization
Spherical Angular-Margin Prior
ArcCosine Additive Margin
Codebook Utilization
Discrete Representation Learning