🤖 AI Summary
This work addresses domain shift, partial-label mismatch, and gender bias in cross-device, cross-cohort speech data for classifying Parkinson’s disease (PD) and amyotrophic lateral sclerosis (ALS). The authors propose a hybrid framework that combines style-based domain generalization with conditional adversarial alignment, plus an adversarial gender branch, to learn gender-invariant, domain-generalized speech representations. The model performs three-class classification of healthy, PD, and ALS speech across domains. The study presents the first unified cross-cohort benchmark that jointly accounts for partially overlapping labels and fairness constraints. Evaluated on four heterogeneous sustained-vowel datasets under both domain generalization and unsupervised domain adaptation protocols, the method achieves the best external generalization among twelve compared state-of-the-art approaches while reducing gender disparities.
📝 Abstract
Voice-based digital biomarkers can enable scalable, non-invasive screening and monitoring of Parkinson's disease (PD) and amyotrophic lateral sclerosis (ALS). However, models trained on one cohort or device often fail in new acquisition settings due to cross-device and cross-cohort domain shift. This challenge is amplified in real-world scenarios with partial-label mismatch, where datasets contain different disease labels and overlap only partially in class space. In addition, voice-based models may exploit demographic cues, raising concerns about gender-related unfairness, particularly when deployed across heterogeneous cohorts. To tackle these challenges, we propose a hybrid framework for unified three-class (healthy/PD/ALS) cross-domain voice classification from partially overlapping cohorts. The method combines style-based domain generalization with conditional adversarial alignment tailored to partial-label settings, reducing negative transfer. An additional adversarial gender branch promotes gender-invariant representations. We conduct a comprehensive evaluation across four heterogeneous sustained-vowel datasets, spanning distinct acquisition settings and devices, under both domain generalization and unsupervised domain adaptation protocols. The proposed approach is compared against twelve state-of-the-art machine learning and deep learning methods and further evaluated through three targeted ablations. This yields the first cross-cohort benchmark and end-to-end domain-adaptive framework for unified healthy/PD/ALS voice classification under partial-label mismatch and fairness constraints. Across all experimental settings, our method consistently achieves the best external generalization over the considered evaluation metrics, while maintaining reduced gender disparities. Notably, no competing method shows statistically significant gains in external performance.
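As an illustration of how a gender disparity might be quantified on an external cohort, the following is a minimal sketch only. The abstract does not state which fairness metric is used, so the choice of macro recall, the function names `macro_recall` and `gender_gap`, and the toy labels are all assumptions made for this example. Classes absent from a cohort are skipped, which is one simple way to score a partially overlapping label space.

```python
from collections import defaultdict

# Assumed label names for the unified three-class task.
CLASSES = ("healthy", "PD", "ALS")


def macro_recall(y_true, y_pred, classes=CLASSES):
    """Mean per-class recall, averaged over the classes present in y_true.

    Skipping absent classes lets the same function score cohorts that
    contain only a subset of the labels (e.g. healthy/PD only).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    present = [c for c in classes if totals[c] > 0]
    if not present:
        raise ValueError("no labelled samples for the given classes")
    return sum(hits[c] / totals[c] for c in present) / len(present)


def gender_gap(y_true, y_pred, gender):
    """Return (gap, per-group scores), where gap is the difference between
    the highest and lowest macro recall across gender groups."""
    groups = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, gender):
        groups[g][0].append(t)
        groups[g][1].append(p)
    scores = {g: macro_recall(ts, ps) for g, (ts, ps) in groups.items()}
    return max(scores.values()) - min(scores.values()), scores


# Toy data (hypothetical, not taken from the paper).
y_true = ["healthy", "PD", "ALS", "healthy", "PD", "ALS"]
y_pred = ["healthy", "PD", "ALS", "healthy", "ALS", "ALS"]
gender = ["F", "F", "F", "M", "M", "M"]

gap, per_group = gender_gap(y_true, y_pred, gender)
print(f"per-group macro recall: {per_group}, gap: {gap:.3f}")
```

A gap near zero means the classifier performs similarly for each group on this metric; other disparity measures (for example, differences in per-class false-negative rates) could be substituted without changing the structure.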