AI Summary
This work proposes a multilingual framework for automatic intelligibility assessment of dysarthric speech, addressing the limitations of existing approaches, which are often confined to a single language or fail to model language-specific factors. The framework integrates universal phone recognition with language-specific phone-to-phoneme mapping based on contrastive phonological feature distances, and uses sequence alignment to produce multidimensional intelligibility metrics. It introduces Phoneme Coverage (PhonCov), a novel alignment-free metric that, together with Phoneme Error Rate (PER) and Phonological Feature Error Rate (PFER), forms a three-metric evaluation suite. Experiments on English, Spanish, Italian, and Tamil show that the framework captures clinically relevant patterns of intelligibility degradation, and that the individual metrics benefit differently from phoneme mapping, alignment, or their combination.
Abstract
The growing prevalence of neurological disorders associated with dysarthria motivates the need for automated intelligibility assessment methods that are applicable across languages. However, most existing approaches are either limited to a single language or fail to capture language-specific factors shaping intelligibility. We present a multilingual phoneme-production assessment framework that integrates universal phone recognition with language-specific phoneme interpretation, using contrastive phonological feature distances for phone-to-phoneme mapping and sequence alignment. The framework yields three metrics: phoneme error rate (PER), phonological feature error rate (PFER), and a newly proposed alignment-free measure, phoneme coverage (PhonCov). Analyses on English, Spanish, Italian, and Tamil show that PER benefits from the combination of mapping and alignment, PFER from alignment alone, and PhonCov from mapping. Further analyses demonstrate that the proposed framework captures clinically meaningful patterns of intelligibility degradation consistent with established observations of dysarthric speech.
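To make the contrast between an alignment-based and an alignment-free metric concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. PER is computed in the conventional way, as unit-cost Levenshtein distance divided by reference length. The abstract does not define PhonCov, so `phoneme_coverage` encodes one plausible reading (the share of a language's phoneme inventory that appears in the produced sequence). All function names and the toy inventory are hypothetical.

```python
from typing import Sequence, Set


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Alignment-based: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def phoneme_coverage(hyp: Sequence[str], inventory: Set[str]) -> float:
    """Alignment-free. ASSUMED definition, not taken from the paper:
    fraction of the language's phoneme inventory present in hyp."""
    if not inventory:
        raise ValueError("inventory must be non-empty")
    return len(set(hyp) & inventory) / len(inventory)


# Toy example: the speaker produces /t/ where /s/ was expected.
reference = ["k", "a", "s", "a"]
produced = ["k", "a", "t", "a"]
toy_inventory = {"k", "a", "s", "t", "p"}  # hypothetical inventory

per = phoneme_error_rate(reference, produced)        # 0.25
coverage = phoneme_coverage(produced, toy_inventory)  # 0.6
```

Under this reading, the coverage score depends only on which phonemes occur, not on their order, so it needs no alignment between reference and produced sequences. It does depend on how recognized phones are assigned to phonemes.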