🤖 AI Summary
This work addresses the severe and uneven loss of emotional information that discretized speech representations suffer under high compression. The authors analyze the impact of residual vector quantization (RVQ) on emotional content from both representation- and task-level perspectives and propose Emo-Q, an emotion-aware quantization method. Emo-Q preserves emotional characteristics at low bitrates by combining emotion-specific and emotion-biased codebooks with a lightweight routing mechanism. Experiments show that the approach significantly alleviates emotion degradation across diverse model architectures and emotion categories, improving accuracy on emotion recognition tasks.
📝 Abstract
Modern speech systems increasingly use discretized self-supervised speech representations for compression and integration with token-based models, yet their impact on emotional information remains unclear. We study how residual vector quantization (RVQ) reshapes emotional information in discrete speech representations from both representation- and task-level perspectives. Our analysis shows that aggressive compression disproportionately degrades emotion, with uneven loss across emotion classes and model architectures. To address this, we introduce emotion-aware quantization using emotion-specific and emotion-biased codebooks, improving the preservation of both hard and soft emotion perception. We further propose Emo-Q, a lightweight routed quantization method that selects emotion-specialized codebooks, improving emotion recognition performance at lower bitrates. These results highlight the importance of emotion-aware discretization for robust affective speech processing.
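The abstract describes RVQ and its routed, emotion-specialized variant only at a high level. As a rough illustration of the underlying mechanics (not the paper's actual Emo-Q implementation), the following NumPy sketch shows plain RVQ, where each stage quantizes the residual left by earlier stages, plus a hypothetical nearest-centroid "router" that selects one of several specialized codebook stacks. All function names, shapes, and the routing rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_codebook(num_codes, dim):
    """Random codebook with an explicit zero code, so a stage can
    'opt out' and the residual error never increases across stages."""
    return np.vstack([np.zeros((1, dim)), rng.standard_normal((num_codes, dim))])

def rvq_encode(x, codebooks):
    """Plain residual vector quantization: each stage quantizes the
    residual left over by the previous stages."""
    codes, recon = [], np.zeros_like(x)
    for cb in codebooks:                               # cb: (num_codes, dim)
        residual = x - recon
        idx = int(np.argmin(((cb - residual) ** 2).sum(axis=1)))
        codes.append(idx)
        recon = recon + cb[idx]
    return codes, recon

def routed_rvq_encode(x, codebook_stacks, router_centroids):
    """Illustrative routed variant (hypothetical): a nearest-centroid
    router picks one specialized codebook stack, then ordinary RVQ runs."""
    route = int(np.argmin(((router_centroids - x) ** 2).sum(axis=1)))
    codes, recon = rvq_encode(x, codebook_stacks[route])
    return route, codes, recon

dim, num_codes, num_stages, num_routes = 8, 16, 4, 3
stacks = [[make_codebook(num_codes, dim) for _ in range(num_stages)]
          for _ in range(num_routes)]
centroids = rng.standard_normal((num_routes, dim))

x = rng.standard_normal(dim)
route, codes, recon = routed_rvq_encode(x, stacks, centroids)
err_one_stage = np.linalg.norm(x - rvq_encode(x, stacks[route][:1])[1])
err_all_stages = np.linalg.norm(x - recon)
```

In a real system the router would be a small learned classifier over emotion cues rather than nearest-centroid matching, and the codebooks would be trained; the sketch only conveys how routing composes with residual quantization.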