🤖 AI Summary
Addressing the fundamental trade-off between reconstruction fidelity and compression efficiency in visual tokenization and generation, this paper proposes a spherical non-parametric quantization method based on the 24-dimensional Leech lattice. It is the first work to introduce the highly symmetric Leech lattice into quantizer design, leveraging uniform sampling on the hypersphere and lattice coding theory to achieve efficient tokenization that needs neither auxiliary losses nor a lookup table. The method integrates seamlessly into both autoencoder-based and autoregressive (AR) generative frameworks. In image tokenization, it consistently outperforms BSQ, achieving notable PSNR/SSIM gains while reducing bit-rate by approximately 1.2%; in AR image generation, it lowers FID by 8.3%, markedly improving visual fidelity and structural consistency. The core innovation lies in exploiting the Leech lattice's exceptional symmetry and evenly distributed points on the hypersphere to unify non-parametric quantization methods under a lattice-coding formulation, thereby easing the classical scalar/vector quantization trade-off.
📝 Abstract
Non-parametric quantization has received much attention due to its parameter efficiency and scalability to large codebooks. In this paper, we present a unified formulation of different non-parametric quantization methods through the lens of lattice coding. The geometry of lattice codes explains the necessity of auxiliary loss terms when training auto-encoders with certain existing lookup-free quantization variants such as BSQ. As a step forward, we explore several possible candidates, including random lattices, generalized Fibonacci lattices, and densest sphere packing lattices. Among all, we find that the Leech lattice-based quantization method, dubbed Spherical Leech Quantization ($\Lambda_{24}$-SQ), leads to both a simplified training recipe and an improved reconstruction-compression trade-off thanks to its high symmetry and even distribution on the hypersphere. In image tokenization and compression tasks, this quantization approach achieves better reconstruction quality across all metrics than BSQ, the best prior art, while consuming slightly fewer bits. The improvement also extends to state-of-the-art auto-regressive image generation frameworks.
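To make the "lookup-free" setting concrete, the sketch below implements the BSQ-style baseline that the paper compares against: a latent vector is projected onto the unit hypersphere and each coordinate is snapped to $\pm 1/\sqrt{d}$, so the quantizer has $2^d$ implicit codewords and requires no stored codebook. This is a minimal illustration of the quantization style discussed here, not the paper's $\Lambda_{24}$-SQ method; the Leech-lattice nearest-point decoder is considerably more involved and is not reproduced in this text.

```python
import numpy as np

def bsq_quantize(x: np.ndarray) -> np.ndarray:
    """Lookup-free binary spherical quantization (BSQ-style sketch).

    Projects x onto the unit hypersphere, then snaps each coordinate
    to +/- 1/sqrt(d). The resulting codeword also lies on the unit
    sphere, and its index is simply the sign pattern, so no codebook
    lookup is needed.
    """
    d = x.shape[-1]
    u = x / np.linalg.norm(x, axis=-1, keepdims=True)  # project to sphere
    # np.sign(0) is 0; with continuous latents this is a measure-zero case.
    return np.sign(u) / np.sqrt(d)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 24))   # 24-dim latents, matching Lambda_24's dimension
q = bsq_quantize(x)
print(np.allclose(np.linalg.norm(q, axis=-1), 1.0))  # codewords have unit norm
```

$\Lambda_{24}$-SQ keeps the same lookup-free structure but replaces the per-coordinate sign decision with a snap to the nearest scaled Leech lattice point on the sphere, which is what yields the more even codeword distribution the abstract refers to.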