🤖 AI Summary
Despite growing interest in natural language processing (NLP) for dementia research, the field lacks a unified, interdisciplinary synthesis of methodologies, gaps, and translational pathways. Method: We systematically reviewed over 240 publications, applying bibliometric and thematic analysis to map applications across four domains (dementia detection, linguistic biomarker extraction, caregiver support, and patient assistance) and to characterize technical approaches including speech-to-text transcription, text mining, electronic health record parsing, social media modeling, and synthetic data generation. Contribution/Results: We identify seven data modalities and five underexplored directions, including artificially degraded language models, digital twins, and synthetic-data-driven paradigms. Critically, we integrate perspectives from medicine, NLP, and engineering to diagnose key bottlenecks: deficient trust mechanisms, insufficient scientific rigor, and weak cross-domain collaboration. We propose an ethics-aligned framework and multimodal data governance guidelines, delivering the first comprehensive roadmap toward clinically translatable, trustworthy, and high-impact NLP-dementia research.
📝 Abstract
The close link between cognitive decline and language has fostered long-standing collaboration between the NLP and medical communities in dementia research. To examine this collaboration, we reviewed over 240 papers applying NLP to dementia-related efforts, drawing from medical, technological, and NLP-focused literature. We identify key research areas, including dementia detection, linguistic biomarker extraction, caregiver support, and patient assistance, and show that half of all papers focus solely on dementia detection using clinical data. Yet many directions remain unexplored: artificially degraded language models, synthetic data, digital twins, and more. We highlight gaps and opportunities around trust, scientific rigor, applicability, and cross-community collaboration. We raise ethical dilemmas in the field and highlight the diverse datasets encountered throughout our review: recorded, written, structured, spontaneous, synthetic, clinical, social media-based, and more. This review aims to inspire more creative, impactful, and rigorous research on NLP for dementia.