Multilingual Dysarthric Speech Assessment Using Universal Phone Recognition and Language-Specific Phonemic Contrast Modeling

2026-01-29
AI Summary
This work proposes a multilingual framework for automatic intelligibility assessment of dysarthric speech, addressing the limitations of existing approaches, which are often confined to a single language and struggle to model language-specific factors. The framework integrates universal phone recognition with language-specific phone-to-phoneme mappings derived from contrastive phonological features, and uses sequence alignment to produce multidimensional intelligibility metrics. Notably, it introduces phoneme coverage (PhonCov), a novel alignment-free metric that, together with phoneme error rate (PER) and phonological feature error rate (PFER), forms a three-metric evaluation suite. Experiments on English, Spanish, Italian, and Tamil indicate that the framework captures clinically relevant patterns of intelligibility degradation, with individual metrics benefiting differentially from phonemic mapping, alignment, or their combination.
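As a rough illustration of what a feature-distance-based phone-to-phoneme mapping could look like, the sketch below assigns each recognized phone to the nearest phoneme in a language's inventory. The four-feature table, the Hamming distance, and the names `FEATURES`, `feature_distance`, and `map_to_phoneme` are all assumptions made for this example; the paper's actual feature set and distance are not specified here.

```python
# Toy binary features: (voiced, nasal, labial, continuant).
# Illustrative values only; NOT the feature set used in the paper.
FEATURES = {
    "p": (0, 0, 1, 0),
    "b": (1, 0, 1, 0),
    "m": (1, 1, 1, 0),
    "f": (0, 0, 1, 1),
    "v": (1, 0, 1, 1),
}


def feature_distance(a, b):
    """Hamming distance between the feature vectors of two segments."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))


def map_to_phoneme(phone, inventory):
    """Return the phoneme in `inventory` nearest to `phone`.

    Ties are broken by inventory order (an arbitrary choice for this sketch).
    """
    return min(inventory, key=lambda ph: feature_distance(phone, ph))


# Hypothetical language whose inventory has no /v/ or /b/.
inventory = ["p", "m", "f"]
recognized = ["v", "p", "m"]
mapped = [map_to_phoneme(p, inventory) for p in recognized]
print(mapped)
```

In this toy setup the out-of-inventory phone `v` is mapped to `f`, its closest neighbor in the inventory, while phones that are already phonemes of the language map to themselves.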

Abstract
The growing prevalence of neurological disorders associated with dysarthria motivates the need for automated intelligibility assessment methods that are applicable across languages. However, most existing approaches are either limited to a single language or fail to capture language-specific factors shaping intelligibility. We present a multilingual phoneme-production assessment framework that integrates universal phone recognition with language-specific phoneme interpretation using contrastive phonological feature distances for phone-to-phoneme mapping and sequence alignment. The framework yields three metrics: phoneme error rate (PER), phonological feature error rate (PFER), and a newly proposed alignment-free measure, phoneme coverage (PhonCov). Analyses on English, Spanish, Italian, and Tamil show that PER benefits from the combination of mapping and alignment, PFER from alignment alone, and PhonCov from mapping. Further analyses demonstrate that the proposed framework captures clinically meaningful patterns of intelligibility degradation consistent with established observations of dysarthric speech.
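To make the contrast between an alignment-based and an alignment-free metric concrete, here is a minimal sketch. PER is computed in the conventional way, as unit-cost edit distance divided by reference length. The `coverage` function is only a guess at what an alignment-free coverage measure might look like (the share of distinct target phonemes that appear anywhere in the output); the abstract does not define PhonCov, so this should not be read as the authors' formula. All function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference phoneme
                curr[j - 1] + 1,         # insertion of an extra phoneme
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by reference length (alignment-based)."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def coverage(ref, hyp):
    """ASSUMED definition: fraction of distinct reference phonemes that
    occur anywhere in the hypothesis. No alignment is required."""
    targets = set(ref)
    if not targets:
        raise ValueError("reference sequence must be non-empty")
    return len(targets & set(hyp)) / len(targets)


ref = ["k", "a", "t", "a"]   # target phonemes
hyp = ["k", "a", "d"]        # recognized and mapped phonemes
per = phoneme_error_rate(ref, hyp)
cov = coverage(ref, hyp)
print(per, cov)
```

Here one substitution and one deletion give a PER of 0.5, while two of the three distinct target phonemes are produced, so the assumed coverage is 2/3. PFER would replace the 0/1 substitution cost with a feature-based distance, which is omitted here because the source does not state its exact form.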
Problem

Research questions and friction points this paper is trying to address.

dysarthria
intelligibility assessment
multilingual speech
phonemic contrast
language-specific factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

universal phone recognition
language-specific phonemic contrast
phonological feature distance
alignment-free metric
multilingual dysarthria assessment
Eunjung Yeo
Department of Computer Science, University of Texas at Austin, Austin, TX, USA
Julie Liss
College of Health Solutions, Arizona State University, Tempe, AZ, USA
Visar Berisha
Professor, College of Engineering and College of Health Solutions, Arizona State University
Speech and audio AI, Clinical speech analytics, Machine learning, Healthcare AI
David R. Mortensen
Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA