🤖 AI Summary
This study reports a gender bias in the auditory perception of Alzheimer’s disease (AD) speech: male voices were more often judged as AD than female voices. In a perception experiment, 16 Chinese listeners evaluated both Chinese and Greek speech, and the bias was particularly pronounced for Chinese speech. Method: The authors combined perceptual evaluation with acoustic analysis of features including shimmer and speech portion. Contribution/Results: Shimmer values in male speech were significantly associated with AD perception, whereas speech portion correlated negatively with AD identification. Language itself did not have a significant effect on AD perception. The findings underscore that AD speech detection models need to address gender bias, and that model performance should be validated across different linguistic contexts.
📝 Abstract
Gender bias has been widely observed in speech perception tasks, influenced by fundamental differences in voicing between genders. This study reveals a gender bias in the perception of Alzheimer's Disease (AD) speech. In a perception experiment in which 16 Chinese listeners evaluated both Chinese and Greek speech, male speech was more frequently judged to be AD speech, and this bias was particularly pronounced in Chinese speech. Acoustic analysis showed that shimmer values in male speech were significantly associated with AD perception, while speech portion exhibited a significant negative correlation with AD identification. Although language did not have a significant impact on AD perception, our findings underscore the critical role of gender bias in AD speech perception. This work highlights the necessity of addressing gender bias when developing AD detection models and calls for further research to validate model performance across different linguistic contexts.
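The abstract does not define its two acoustic measures, so the following is only an illustrative sketch of how they are commonly computed, not the authors' pipeline. It assumes shimmer means local shimmer (the mean absolute difference between the peak amplitudes of consecutive glottal cycles, divided by the mean amplitude) and that speech portion means the fraction of a recording occupied by speech rather than pauses. The function names and input formats are hypothetical.

```python
from statistics import mean


def local_shimmer(peak_amplitudes):
    """Local shimmer: mean absolute difference between consecutive
    cycle peak amplitudes, divided by the mean peak amplitude.

    Assumption: the amplitudes were already extracted per glottal cycle
    by some upstream pitch-marking step not shown here.
    """
    if len(peak_amplitudes) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    return mean(diffs) / mean(peak_amplitudes)


def speech_portion(speech_segments, total_duration):
    """Fraction of the recording covered by speech.

    Assumption: speech_segments is a list of non-overlapping
    (start, end) times in seconds, and total_duration is in seconds.
    """
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    spoken = sum(end - start for start, end in speech_segments)
    return spoken / total_duration
```

Under these assumed definitions, a perfectly steady voice has a shimmer of 0, and a recording with long pauses has a low speech portion, which is the direction the abstract links to more frequent AD identification.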