LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the scarcity of speech data for early screening of Alzheimer's disease and related dementias (ADRD). It proposes a detection framework integrating speech-text analysis and synthetic data augmentation: (1) fine-tuning of ten Transformer models and fusion of the top model's embeddings with 110 handcrafted linguistic features; (2) use of clinically tuned and general-purpose large language models (MedAlpaca-7B, LLaMA-8B/70B, Ministral-8B, GPT-4o) to generate label-conditioned synthetic speech transcripts; and (3) systematic evaluation of multimodal models (GPT-4o, Qwen-Omni, Phi-4) under zero-shot and fine-tuned settings. A key contribution is the empirical finding that distributional similarity between synthetic and real data critically determines performance gains. The fused model achieves F1 = 83.3 (AUC = 89.5); augmenting training data with 2x MedAlpaca-7B synthetic transcripts raises F1 to 85.7. Fine-tuning MedAlpaca-7B as a classifier improves its F1 from 47.3 to 78.5, demonstrating the efficacy of clinically tuned LLMs for both classification and data augmentation in low-resource ADRD screening.

📝 Abstract
Alzheimer's disease and related dementias (ADRD) affect approximately five million older adults in the U.S., yet over half remain undiagnosed. Speech-based natural language processing (NLP) offers a promising, scalable approach to detect early cognitive decline through linguistic markers. This study aimed to develop and evaluate a screening pipeline that (i) fuses transformer embeddings with handcrafted linguistic features, (ii) tests data augmentation using synthetic speech generated by large language models (LLMs), and (iii) benchmarks unimodal and multimodal LLM classifiers for ADRD detection. Transcripts from the DementiaBank "cookie-theft" task (n = 237) were used. Ten transformer models were evaluated under three fine-tuning strategies. A fusion model combined embeddings from the top-performing transformer with 110 lexically derived linguistic features. Five LLMs (LLaMA-8B/70B, MedAlpaca-7B, Ministral-8B, GPT-4o) were fine-tuned to generate label-conditioned synthetic speech, which was used to augment training data. Three multimodal models (GPT-4o, Qwen-Omni, Phi-4) were tested for speech-text classification in zero-shot and fine-tuned settings. The fusion model achieved F1 = 83.3 (AUC = 89.5), outperforming linguistic-only or transformer-only baselines. Augmenting training data with 2x MedAlpaca-7B synthetic speech increased F1 to 85.7. Fine-tuning significantly improved unimodal LLM classifiers (e.g., MedAlpaca-7B: F1 = 47.3 -> 78.5). Current multimodal models demonstrated lower performance (GPT-4o: F1 = 70.2; Qwen-Omni: F1 = 66.0). Performance gains aligned with the distributional similarity between synthetic and real speech. Integrating transformer embeddings with linguistic features enhances ADRD detection from speech. Clinically tuned LLMs effectively support both classification and data augmentation, while further advancement is needed in multimodal modeling.
Problem

Research questions and friction points this paper is trying to address.

Detect Alzheimer's disease using speech-based NLP and linguistic markers
Enhance detection by combining transformer models with linguistic features
Improve accuracy using LLM-generated synthetic data for training augmentation
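The augmentation idea above can be sketched in a few lines. This is an illustrative toy, not the paper's pipeline: the transcripts are invented stand-ins for real and LLM-generated "cookie-theft" descriptions, and cosine similarity of mean TF-IDF vectors is one crude proxy for the real-vs-synthetic distributional similarity the paper reports as predictive of gains.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-ins for real and LLM-generated transcripts.
real = [
    "the boy is on the stool reaching for the cookie jar",
    "water is overflowing from the sink while mother dries a dish",
]
synthetic = [
    "a boy stands on a stool taking cookies from the jar",
    "the sink overflows as the mother wipes a plate",
]

# Fit one shared vocabulary so the two sets are vectorized comparably.
vec = TfidfVectorizer().fit(real + synthetic)
real_mean = np.asarray(vec.transform(real).mean(axis=0))
syn_mean = np.asarray(vec.transform(synthetic).mean(axis=0))

# Crude distributional-similarity proxy: cosine of the mean TF-IDF vectors.
sim = cosine_similarity(real_mean, syn_mean)[0, 0]

# Augmentation itself is then just pooling: real transcripts plus 2x synthetic.
augmented = real + synthetic * 2
```

In the paper's setting, a similarity check like this would be run before training, since only synthetic sets close to the real distribution improved F1.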
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses transformer embeddings with linguistic features
Uses LLM-generated synthetic data for augmentation
Benchmarks multimodal LLM classifiers for detection
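The fusion step in the first bullet can be sketched as feature-block concatenation followed by a linear classifier. This is a minimal illustration under stated assumptions, not the paper's implementation: the arrays are random stand-ins for 768-dim transformer embeddings and the 110 linguistic features, and logistic regression stands in for whatever classification head the authors used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical shapes: 237 transcripts (as in DementiaBank),
# 768-dim transformer embeddings, 110 handcrafted linguistic features.
n, emb_dim, ling_dim = 237, 768, 110
embeddings = rng.normal(size=(n, emb_dim))
linguistic = rng.normal(size=(n, ling_dim))
labels = rng.integers(0, 2, size=n)  # 0 = control, 1 = ADRD (random here)

# Fusion: scale each feature block separately, then concatenate per sample.
fused = np.hstack([
    StandardScaler().fit_transform(embeddings),
    StandardScaler().fit_transform(linguistic),
])  # shape (237, 878)

X_tr, X_te, y_tr, y_te = train_test_split(
    fused, labels, test_size=0.2, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
```

Scaling each block before concatenation keeps the 110 linguistic features from being drowned out by the much wider embedding block, which is a common reason feature fusion underperforms either input alone.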
Ali Zolnour
School of Electrical and Computer Engineering, University of Tehran
Natural Language Processing, Speech Recognition, Deep Learning, Reinforcement Learning
Hossein Azadmaleki
School of Electrical and Computer Engineering, University of Tehran
Speech Processing, Natural Language Processing, Healthcare AI, Cognitive Screening
Yasaman Haghbin
University of Tehran
Machine Learning, Generative AI, Natural Language Processing, Speech Analysis
Fatemeh Taherinezhad
Columbia University Irving Medical Center, New York, NY, United States
Mohamad Javad Momeni Nezhad
School of Electrical and Computer Engineering, University of Tehran
LLMs, Natural Language Processing
Sina Rashidi
Computer Engineering Department, Sharif University of Technology
Speech Processing, Natural Language Processing, Healthcare AI, Alzheimer's Disease
Masoud Khani
University of Wisconsin-Milwaukee
AI, Machine Learning, XAI, Advanced Predictive Modeling, Clinical Data Science
AmirSajjad Taleban
University of Wisconsin-Milwaukee, Milwaukee, WI, United States
Samin Mahdizadeh Sani
School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Maryam Dadkhah
Columbia University Irving Medical Center, New York, NY, United States
James M. Noble
Department of Neurology, Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, GH Sergievsky Center, Columbia University, New York, NY, United States
Suzanne Bakken
Columbia University
Biomedical Informatics, Health Disparities
Yadollah Yaghoobzadeh
University of Tehran / TeIAS
Natural Language Processing, Deep Learning
Abdol-Hossein Vahabie
School of ECE & Faculty of Psychology, University of Tehran; School of Cognitive Sciences, IPM
Cognitive Neuroscience, Neuroeconomics, Neural Dynamics, Machine Learning, Computational Psychiatry
Masoud Rouhizadeh
Assistant Professor, University of Florida
AI in Healthcare, Medical Informatics, Natural Language Processing, Machine Learning, Population Health Informatics
Maryam Zolnoori
Mayo Clinic | Columbia University
Text Mining, Speech Analysis, Machine Learning, Cognitive Impairment, Drug Effectiveness