🤖 AI Summary
To address the need for early Alzheimer's disease (AD) screening in aging societies, this paper proposes a speech-based AD detection framework that combines automatic speech recognition (ASR) with a large language model (LLM). The core contribution is the use of chain-of-thought (CoT) reasoning in speech-driven AD detection: ASR transcribes the spoken utterances into text, and an LLM guided by CoT prompts and cues then classifies the transcript as AD or non-AD. A linear classification layer is added to the LLM, and the model is trained with supervised fine-tuning (SFT). The authors report a 16.7% relative performance improvement over the same approach without CoT prompt reasoning and, to the best of their knowledge, state-of-the-art performance among CoT approaches. The step-by-step reasoning is also intended to make the model's decisions easier to interpret.
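The pipeline summarized above (transcript, then CoT prompt, then a linear classification layer) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the prompt wording in `COT_TEMPLATE`, the helper names, and the use of a sigmoid with a 0.5 threshold are hypothetical, and the feature vector stands in for whatever pooled LLM representation the real model would feed into its linear layer.

```python
import math
from typing import Sequence

# Hypothetical prompt wording; the paper's actual CoT prompt is not given here.
COT_TEMPLATE = (
    "Below is a transcript of a person's speech.\n"
    "Transcript: {transcript}\n"
    "Think step by step about fluency, word-finding difficulty, and coherence, "
    "then decide whether the speaker shows signs of Alzheimer's disease."
)


def build_cot_prompt(transcript: str) -> str:
    """Wrap an ASR transcript in a chain-of-thought style instruction."""
    return COT_TEMPLATE.format(transcript=transcript.strip())


def linear_head(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear layer followed by a sigmoid (assumed), returning a score in (0, 1)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    logit = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def classify(features: Sequence[float], weights: Sequence[float], bias: float,
             threshold: float = 0.5) -> str:
    """Map the head's score to the binary label used in the task."""
    return "AD" if linear_head(features, weights, bias) >= threshold else "non-AD"


if __name__ == "__main__":
    prompt = build_cot_prompt("  the boy is on the stool reaching for the cookie jar  ")
    print(prompt)
    # Toy feature vector and weights, purely illustrative.
    print(classify([1.0, 0.0], [2.0, -1.0], 0.0))
```

In a real system the features would come from the fine-tuned LLM's hidden states and the weights would be learned during SFT; the sketch only shows how the pieces connect.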
📝 Abstract
Societies worldwide are rapidly entering a super-aged era, making elderly health a pressing concern. The aging population is increasing the burden on national economies and households, and dementia cases are rising significantly with this demographic shift. Recent research using voice-based models and large language models (LLMs) offers new possibilities for dementia diagnosis and treatment. Our Chain-of-Thought (CoT) reasoning method combines speech and language models. The process starts with automatic speech recognition, which converts speech to text. We then add a linear layer to an LLM to classify Alzheimer's disease (AD) versus non-AD, using supervised fine-tuning (SFT) with CoT reasoning and cues. This approach showed a 16.7% relative performance improvement compared to methods without CoT prompt reasoning. To the best of our knowledge, our proposed method achieves state-of-the-art performance among CoT approaches.
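A relative improvement figure such as the one reported above is a ratio to the baseline score, not a difference in percentage points. The short Python sketch below shows the arithmetic. The two scores are illustrative placeholders chosen only to show the calculation; they are NOT results from the paper, and the function name is hypothetical.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return (improved - baseline) / baseline as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Illustrative numbers only, NOT taken from the paper.
    baseline_score = 0.60  # hypothetical score without CoT
    cot_score = 0.70       # hypothetical score with CoT
    gain = relative_improvement(baseline_score, cot_score)
    print(f"{gain:.1%}")  # prints "16.7%"
```

With these placeholder values, a 10-point absolute gain on a baseline of 0.60 corresponds to a relative gain of about 16.7%, which is why the baseline score matters when reading such a figure.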