Geometric Analysis of Speech Representation Spaces: Topological Disentanglement and Confound Detection

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses an open question in multilingual voice-based health assessment: whether pathological speech markers can be separated from accent variation in representation spaces. When they cannot, systems risk misclassifying healthy non-native speakers or missing pathology in multilingual patients. The work proposes a framework that quantifies the geometric disentanglement of emotional, linguistic, and pathological attributes in embedding spaces across six corpora and eight dataset combinations, using four clustering metrics (including the Silhouette coefficient), permutation tests, and trustworthiness analysis. Emotional features form the tightest clusters (Silhouette 0.250), followed by pathological (0.141) and linguistic features (0.077). The overlap between pathology and language stays below 0.21: above the permutation null, but bounded for clinical deployment. The framework offers actionable guidelines for equitable and reliable speech health systems across diverse populations.

📝 Abstract
Speech-based clinical tools are increasingly deployed in multilingual settings, yet it remains unclear whether pathological speech markers are geometrically separable from accent variation. Systems may misclassify healthy non-native speakers or miss pathology in multilingual patients. We propose a four-metric clustering framework to evaluate geometric disentanglement of emotional, linguistic, and pathological speech features across six corpora and eight dataset combinations. A consistent hierarchy emerges: emotional features form the tightest clusters (Silhouette 0.250), followed by pathological (0.141) and linguistic (0.077). Confound analysis shows that pathological-linguistic overlap remains below 0.21, which is above the permutation null but bounded for clinical deployment. Trustworthiness analysis confirms embedding fidelity and robustness of the geometric conclusions. Our framework provides actionable guidelines for equitable and reliable speech health systems across diverse populations.
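To make the abstract's two central quantities concrete, the following is a minimal sketch of a Silhouette score and a label-permutation null, written in plain Python. It is not the authors' implementation: the toy 2-D points, the group labels, the Euclidean distance, the helper names `silhouette` and `permutation_p_value`, and the choice of 200 permutations are all assumptions made for illustration. The paper's embeddings, its other three clustering metrics, and its exact test procedure are not given in this listing.

```python
import math
import random


def silhouette(points, labels):
    """Mean Silhouette coefficient (Euclidean distance, at least two clusters)."""
    n = len(points)
    clusters = set(labels)
    scores = []
    for i in range(n):
        # Accumulate distances from point i to every other point, per cluster.
        sums = {c: 0.0 for c in clusters}
        counts = {c: 0 for c in clusters}
        for j in range(n):
            if i != j:
                sums[labels[j]] += math.dist(points[i], points[j])
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:
            scores.append(0.0)  # singleton cluster: score 0 by convention
            continue
        a = sums[own] / counts[own]  # mean intra-cluster distance
        b = min(sums[c] / counts[c] for c in clusters if c != own)  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n


def permutation_p_value(points, labels, n_perm=200, seed=0):
    """Compare the observed Silhouette against scores under shuffled labels."""
    rng = random.Random(seed)
    observed = silhouette(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if silhouette(points, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic stand-in for embeddings: two well-separated groups (hypothetical data).
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
points += [(data_rng.gauss(6, 1), data_rng.gauss(6, 1)) for _ in range(20)]
labels = ["group_a"] * 20 + ["group_b"] * 20

score, p_value = permutation_p_value(points, labels)
print(f"silhouette={score:.3f}, permutation p={p_value:.4f}")
```

In this reading, a score near 0 means the labelled groups overlap heavily in the space, and a small permutation p-value means the observed structure exceeds what shuffled labels produce. That is the sense in which an overlap value can be "above the permutation null" while still being small in absolute terms.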
Problem

Research questions and friction points this paper is trying to address.

speech pathology
accent variation
geometric disentanglement
confound detection
multilingual speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

geometric disentanglement
confound detection
speech representation
clustering framework
multilingual clinical speech
Bipasha Kashyap
Networked Sensing & Biomedical Engineering (NSBE) Research lab, School of Engineering, Deakin University, Australia
Pubudu N. Pathirana
Professor, Head of Discipline, Mechatronics, E&E Engineering, Deakin University
Human Motion Capture, Assistive Device Design, Computer Networks, Machine Learning