The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

📅 2025-03-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically reviews 91 machine learning (ML) studies leveraging speech for disease diagnosis, focusing on neurological (e.g., Parkinson’s disease, Alzheimer’s disease), laryngeal, and psychiatric disorders (e.g., depression, OCD, autism spectrum disorder). To address the lack of standardized evaluation, we propose the first cross-disorder ML-based speech diagnostic method scoring framework (0–10 scale), quantifying diagnostic potential gradients and research gaps across acoustic, prosodic, and linguistic feature dimensions. Supervised models—including SVM, random forests, CNNs, and LSTMs—achieve >90% accuracy for laryngeal disorders and Parkinson’s-related dysarthria, and 75–95% for depression and Alzheimer’s disease; however, performance remains limited and insufficiently validated for OCD and autism. Our key contribution is a rigorous, comparable assessment framework that reveals both disease-specificity of speech biomarkers and critical bottlenecks in clinical translation.

📝 Abstract
Background: Speech patterns have emerged as potential diagnostic markers for conditions with varying etiologies. Machine learning (ML) presents an opportunity to harness these patterns for accurate disease diagnosis. Objective: This review synthesized findings from studies exploring ML's capability in leveraging speech for the diagnosis of neurological, laryngeal, and mental disorders. Methods: A systematic examination of 564 articles was conducted, of which 91 were included in the study. These covered a wide spectrum of conditions, ranging from voice pathologies to mental and neurological disorders. Speech classification methods were assessed based on the relevant studies and scored from 0 to 10 according to the reported diagnostic accuracy of their ML models. Results: High diagnostic accuracies were consistently observed for laryngeal disorders, dysarthria, and speech-related changes in Parkinson's disease. These findings indicate the robust potential of speech as a diagnostic tool. Disorders such as depression, schizophrenia, mild cognitive impairment, and Alzheimer's dementia also showed high accuracies, albeit with some variability across studies. Meanwhile, disorders such as OCD and autism highlighted the need for more extensive research to ascertain the relationship between speech patterns and the respective conditions. Conclusion: ML models utilizing speech patterns demonstrate promising potential in diagnosing a range of mental, laryngeal, and neurological disorders. However, the efficacy varies across conditions, and further research is needed. The integration of these models into clinical practice could potentially revolutionize the evaluation and diagnosis of a number of different medical conditions.
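The abstract states that methods were scored on a 0 to 10 scale from reported diagnostic accuracy, but it does not give the rubric. As a rough illustration only, the sketch below assumes a simple linear rule (one point per ten percentage points of accuracy, rounded down) and averages the accuracies reported for a method before scoring. The function names `accuracy_to_score` and `score_method`, the floor rule, and the use of a mean are all assumptions made here, not the review's procedure.

```python
from statistics import mean


def accuracy_to_score(accuracy_pct: float) -> int:
    """Map a reported accuracy in percent (0-100) to an integer score from 0 to 10.

    ASSUMPTION: a linear, floor-based rule. The review's real rubric is not
    described in the abstract and may differ.
    """
    if not 0.0 <= accuracy_pct <= 100.0:
        raise ValueError("accuracy must be between 0 and 100 percent")
    # Floor division by 10: 0-9.99 -> 0, 10-19.99 -> 1, ..., 100 -> 10
    return int(accuracy_pct // 10)


def score_method(reported_accuracies: list[float]) -> int:
    """Score one method/disorder pairing from the accuracies reported for it.

    ASSUMPTION: accuracies are aggregated with an unweighted mean before scoring.
    """
    if not reported_accuracies:
        raise ValueError("at least one reported accuracy is required")
    return accuracy_to_score(mean(reported_accuracies))
```

Under this assumed rule, a method whose studies vary widely in accuracy can receive the same score as one with consistent results, so a fuller rubric would probably also account for variability and validation quality.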
Problem

Research questions and friction points this paper is trying to address.

Machine learning for speech-based diagnosis of disorders.
Review of ML methods for neurological, laryngeal, mental disorders.
Assessing diagnostic accuracy of speech classification models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine learning analyzes speech for disease diagnosis.
High accuracy in diagnosing laryngeal and neurological disorders.
Speech patterns as diagnostic markers for mental disorders.
Birger Moell
KTH Speech, Music and Hearing
Fredrik Sand Aronsson
PhD student, Karolinska institutet
Machine learning, speech and language impairments in neurodegenerative disorders
Per Ostberg
Theme Women’s Health and Allied Health Professionals, Unit of Speech and Language Pathology, Karolinska University Hospital; Division of Speech and Language Pathology, Department of Clinical Science, Intervention and Technology, Karolinska Institutet
Jonas Beskow
Professor, KTH Speech, Music and Hearing
multimodal interaction, social robotics, speech synthesis, motion synthesis, sign language processing