🤖 AI Summary
This study examines how linguistic information affects speech-based detection of Parkinson's disease (PD). It evaluates PD classification with pretrained models that differ in training data and pretraining objective: multilingual and English-only Whisper models, self-supervised speech models (wav2vec 2.0, data2vec), AudioSet-pretrained models, and text-only models (e.g., BERT). Two recording tasks are compared: sustained vowel phonation (SVP) and spontaneous speech. Three results stand out: text-only models match the performance of vocal-feature models; multilingual Whisper outperforms the self-supervised models, whereas the monolingual (English-only) Whisper does worse; and AudioSet pretraining improves performance on SVP but not on spontaneous speech. Taken together, the results indicate that language-level information, and not only fine-grained acoustic features, carries discriminative cues for PD, which underscores the role of language in early detection.
📝 Abstract
Using speech samples as a biomarker is a promising avenue for detecting and monitoring the progression of Parkinson's disease (PD), but there is considerable disagreement in the literature about how best to collect and analyze such data. Early research in detecting PD from speech used a sustained vowel phonation (SVP) task, while some recent research has explored recordings of more cognitively demanding tasks. To assess the role of language in PD detection, we tested pretrained models with varying data types and pretraining objectives and found that (1) text-only models match the performance of vocal-feature models, (2) multilingual Whisper outperforms self-supervised models whereas monolingual Whisper does worse, and (3) AudioSet pretraining improves performance on SVP but not spontaneous speech. These findings together highlight the critical role of language for the early detection of Parkinson's disease.