Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ

📅 2025-06-19

📈 Citations: 0

✨ Influential: 0

career value

213K/year

🤖 AI Summary

To address rigid bit-rate allocation and redundant bit consumption by noise components in residual vector quantization (RVQ) speech coding under realistic noisy conditions, this paper proposes a variable-bit-rate RVQ (VRVQ) framework. Our method introduces: (1) the first noise-aware, dynamic frame-level bit-rate allocation mechanism, which adaptively adjusts quantization precision based on speech saliency and noise intensity; and (2) the first end-to-end jointly optimized neural feature-domain denoiser integrated into the RVQ quantization loop. Under rate-distortion joint optimization, VRVQ significantly improves coding efficiency across diverse noise scenarios: at equal bit rates, it achieves a mean opinion score (MOS) gain of ≥0.8 over baseline methods, with superior speech intelligibility and subjective quality compared to constant-bit-rate RVQ (CBR-RVQ) and conventional codecs.

Technology Category

Application Category

📝 Abstract

Residual Vector Quantization (RVQ) has become a dominant approach in neural speech and audio coding, providing high-fidelity compression. However, speech coding presents additional challenges due to real-world noise, which degrades compression efficiency. Standard codecs allocate bits uniformly, wasting bitrate on noise components that do not contribute to intelligibility. This paper introduces a Variable Bitrate RVQ (VRVQ) framework for noise-robust speech coding, dynamically adjusting bitrate per frame to optimize rate-distortion trade-offs. Unlike constant bitrate (CBR) RVQ, our method prioritizes critical speech components while suppressing residual noise. Additionally, we integrate a feature denoiser to further improve noise robustness. Experimental results show that VRVQ improves rate-distortion trade-offs over conventional methods, achieving better compression efficiency and perceptual quality in noisy conditions. Samples are available at our project page: https://yoongi43.github.io/noise_robust_vrvq/.

Problem

Research questions and friction points this paper is trying to address.

Optimize bitrate allocation for noisy speech coding

Enhance compression efficiency by suppressing residual noise

Improve noise robustness with dynamic bitrate adjustment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Variable Bitrate RVQ for dynamic bitrate adjustment

Feature denoiser enhances noise robustness

Prioritizes critical speech components over noise

🔎 Similar Papers

Variable Bitrate Residual Vector Quantization for Audio Coding