Reference-aware SFM layers for intrusive intelligibility prediction

📅 2025-09-21

🤖 AI Summary

Existing intrusive speech intelligibility predictors rely on explicit reference signals but make limited use of speech foundation models (SFMs), which restricts their performance. This paper proposes a reference-conditioned, multi-layer SFM framework that predicts intelligibility through reference-aligned feature extraction, hierarchical SFM feature fusion, and deep regression modeling. Its key contribution is a reference-aware mechanism intended to unlock the representational potential of SFMs for intrusive intelligibility assessment. On the CPC3 challenge, the final system achieves an RMSE of 22.36 on the development set and 24.98 on the evaluation set, ranking 1st.

📝 Abstract

Intrusive speech-intelligibility predictors that exploit explicit reference signals are now widespread, yet they have not consistently surpassed non-intrusive systems. We argue that a primary cause is the limited exploitation of speech foundation models (SFMs). This work revisits intrusive prediction by combining reference conditioning with multi-layer SFM representations. Our final system achieves RMSE 22.36 on the development set and 24.98 on the evaluation set, ranking 1st on CPC3. These findings provide practical guidance for constructing SFM-based intrusive intelligibility predictors.
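The reported figures are root-mean-square errors between predicted and measured intelligibility. As a minimal sketch of that metric, the function below computes RMSE over a handful of listener scores; the 0-100 score scale and the example values are assumptions for illustration and are not taken from the paper.

```python
import math

def rmse(predicted, target):
    """Root-mean-square error between two equal-length score lists."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical intelligibility scores (assumed 0-100 scale), illustration only.
predicted = [80.0, 40.0, 65.0]
target = [100.0, 20.0, 65.0]
score = rmse(predicted, target)
print(f"RMSE: {score:.2f}")
```

Lower is better: an RMSE of 22.36 means the typical prediction error is roughly 22 points in the units of the target score.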

Problem

Research questions and friction points this paper is trying to address.

Improving intrusive speech intelligibility prediction accuracy

Enhancing reference signal utilization with foundation models

Overcoming limitations of current intrusive prediction systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reference-aware SFM layers for intrusive prediction

Multi-layer SFM representations with reference conditioning

SFM-based intrusive intelligibility predictor construction guidance
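To make the idea of multi-layer SFM representations with reference conditioning more concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the softmax-weighted layer sum, the mean pooling over time, and the `[degraded, reference, |difference|]` concatenation are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_layers(layer_feats, layer_logits):
    """Weighted sum over layers: layer_feats[l][t][d] -> fused[t][d]."""
    weights = softmax(layer_logits)
    n_frames = len(layer_feats[0])
    dim = len(layer_feats[0][0])
    return [
        [sum(w * layer[t][d] for w, layer in zip(weights, layer_feats))
         for d in range(dim)]
        for t in range(n_frames)
    ]

def mean_pool(frames):
    """Average over the time axis: frames[t][d] -> pooled[d]."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def reference_conditioned_embedding(deg_feats, ref_feats, layer_logits):
    """Assumed conditioning: concatenate degraded, reference, and |difference|."""
    deg = mean_pool(fuse_layers(deg_feats, layer_logits))
    ref = mean_pool(fuse_layers(ref_feats, layer_logits))
    diff = [abs(a - b) for a, b in zip(deg, ref)]
    return deg + ref + diff

# Toy features: 2 layers, 2 frames, 2 dimensions (hypothetical values).
deg_feats = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
ref_feats = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
layer_logits = [0.0, 0.0]  # equal layer weights; learnable in a real model
emb = reference_conditioned_embedding(deg_feats, ref_feats, layer_logits)
```

In a real system the per-layer features would come from the hidden states of a pretrained SFM, and the resulting embedding would feed a regression head that outputs the intelligibility score. Pooling each signal before comparing them sidesteps frame alignment between the degraded and reference signals, which a finer-grained model would have to handle explicitly.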
