Robust Residual Finite Scalar Quantization for Neural Compression

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the progressive attenuation of residual magnitudes that arises when finite scalar quantization (FSQ) is applied naively in residual quantization, this paper proposes Robust Residual Finite Scalar Quantization (RFSQ). The core innovation is the integration of learnable scaling factors and invertible LayerNorm into the FSQ framework, preserving FSQ's architectural simplicity and training stability while mitigating signal attenuation and enabling effective multi-stage quantization. RFSQ is fully compatible with standard residual quantization architectures and requires no additional hyperparameter tuning. Evaluated on ImageNet, RFSQ achieves up to a 45% improvement in perceptual loss and a 28.7% reduction in L1 reconstruction error over baselines including VQ-EMA and vanilla FSQ, demonstrating substantial gains in reconstruction fidelity and compression efficiency.

📝 Abstract
Finite Scalar Quantization (FSQ) has emerged as a promising alternative to Vector Quantization (VQ) in neural compression, offering simplified training and improved stability. However, naive application of FSQ in residual quantization frameworks suffers from the **residual magnitude decay problem**, where subsequent FSQ layers receive progressively weaker signals, severely limiting their effectiveness. We propose **Robust Residual Finite Scalar Quantization (RFSQ)**, a general framework that addresses this fundamental limitation through two novel conditioning strategies: learnable scaling factors and invertible layer normalization. Our approach maintains the simplicity of FSQ while enabling effective multi-stage residual quantization. Comprehensive experiments on ImageNet demonstrate that RFSQ variants significantly outperform strong baselines including VQ-EMA, FSQ, and LFQ, achieving up to 45% improvement in perceptual loss and 28.7% reduction in L1 reconstruction error. The proposed LayerNorm strategy shows the most consistent improvements across different configurations, establishing RFSQ as a superior quantization method for neural compression.
Problem

Research questions and friction points this paper is trying to address.

- Addresses residual magnitude decay in neural compression quantization
- Proposes robust framework with learnable scaling and normalization
- Enables effective multi-stage residual quantization while maintaining simplicity
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Robust Residual FSQ framework with conditioning strategies
- Learnable scaling factors for signal strength maintenance
- Invertible layer normalization for consistent quantization performance
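The scaling idea above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: `fsq_quantize`, `rfsq_encode`, and the fixed per-stage scale values are hypothetical stand-ins for the learnable scaling factors, and the invertible-LayerNorm variant is omitted. Each residual is scaled up before quantization so later stages still see a signal spanning the quantizer's range, then scaled back down when accumulated.

```python
import numpy as np

def fsq_quantize(z, levels=5):
    """Plain FSQ: bound each channel with tanh, then round to a fixed scalar grid."""
    half = (levels - 1) / 2
    return np.round(np.tanh(z) * half) / half  # codes on a grid in [-1, 1]

def rfsq_encode(x, scales=(1.0, 4.0, 16.0)):
    """Residual FSQ with per-stage scaling (hypothetical fixed scales;
    in the paper these factors are learned). Without the scaling, later
    stages would receive near-zero residuals and contribute little."""
    residual = np.asarray(x, dtype=float).copy()
    recon = np.zeros_like(residual)
    for s in scales:
        q = fsq_quantize(residual * s) / s  # scale up, quantize, scale back
        recon += q
        residual -= q
    return recon

rng = np.random.default_rng(0)
x = rng.normal(size=8) * 0.5
err1 = np.abs(x - rfsq_encode(x, scales=(1.0,))).mean()  # single stage
err3 = np.abs(x - rfsq_encode(x)).mean()                 # three scaled stages
```

With the scaling in place, the three-stage reconstruction error is well below the single-stage error, which is the behavior the residual-magnitude-decay fix is after.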