🤖 AI Summary
This study addresses the limited automation and poor scalability of speech-based early dementia screening. We propose Demenba, a state-space model (SSM)-based framework that models long-range cognitive features directly from audio recordings of neuropsychological tests, without relying on automatic speech recognition (ASR) or handcrafted features. Using over 1,000 hours of cognitive assessment recordings from Framingham Heart Study participants, Demenba improves fine-grained dementia classification by 21% over prior approaches while reducing parameter count by 37%. Its time and memory costs grow linearly with sequence length, enabling efficient scaling. The model can also be fused with large language models (LLMs) for further gains, supporting clinical interpretability and cross-task generalization. Collectively, Demenba advances automated, scalable, and interpretable speech-based dementia screening.
📝 Abstract
Early detection of dementia is critical for timely medical intervention and improved patient outcomes. Neuropsychological tests are widely used for cognitive assessment but have traditionally relied on manual scoring. Automatic dementia classification (ADC) systems aim to infer cognitive decline directly from speech recordings of such tests. We propose Demenba, a novel ADC framework based on state space models, which scale linearly in memory and computation with sequence length. Trained on over 1,000 hours of cognitive assessments administered to Framingham Heart Study participants, some of whom were diagnosed with dementia through adjudicated review, our method outperforms prior approaches in fine-grained dementia classification by 21%, while using fewer parameters. We further analyze its scaling behavior and demonstrate that our model gains additional improvement when fused with large language models, paving the way for more transparent and scalable dementia assessment tools. Code: https://anonymous.4open.science/r/Demenba-0861
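To make the linear-scaling claim concrete, here is a minimal sketch of a generic discrete state space recurrence, h_t = a * h_(t-1) + b * x_t with output y_t = c · h_t, using a diagonal state matrix. It is not the Demenba architecture: the function name `diagonal_ssm_scan` and the parameters `a`, `b`, and `c` are illustrative assumptions. It only shows why a recurrent SSM needs one fixed-size state update per input step.

```python
from typing import List, Sequence


def diagonal_ssm_scan(
    x: Sequence[float],
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
) -> List[float]:
    """Run a diagonal linear SSM over a 1-D input sequence.

    Hypothetical illustration (not the paper's model):
        h_t = a * h_{t-1} + b * x_t   (elementwise over the state)
        y_t = sum(c * h_t)
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("a, b and c must share the same state dimension")

    h = [0.0] * len(a)  # hidden state: size is fixed, independent of len(x)
    y: List[float] = []
    for x_t in x:  # a single pass: work grows linearly with sequence length
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        y.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return y


# Impulse input with a one-dimensional state that decays by half each step.
outputs = diagonal_ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[2.0])
print(outputs)
```

For a sequence of length L and state size N, this loop performs on the order of L × N operations and keeps only N state values between steps, in contrast to the pairwise comparisons of full self-attention, whose cost grows quadratically in L.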