Tackling Cognitive Impairment Detection from Speech: A submission to the PROCESS Challenge

📅 2024-12-30
đŸ€– AI Summary
This study addresses the PROCESS Challenge 2024, which targets non-invasive, automatic detection of cognitive decline, including mild cognitive impairment (MCI), from clinician-guided spoken tasks (narration, repetition, and semantic generation), using both speech and transcribed text. We propose a multi-source complementary modeling framework that integrates pause-based acoustic features, LIWC-based linguistic statistics, macro-linguistic descriptors generated by large language models (LLMs), and neural representations (LongFormer for text; ECAPA-TDNN and TRILLsson for speech). These heterogeneous features are fused via an XGBoost/SVM ensemble classifier. On the three-class AD/MCI/HC classification task, the best-performing systems are combinations of complementary models that draw on acoustic and textual information from all three clinical tasks, selected for balanced performance across the training set, the development set, and the individual classes. The approach is aimed at early screening for cognitive decline from guided speech tasks using multimodal features.

📝 Abstract
This work describes our group's submission to the PROCESS Challenge 2024, with the goal of assessing cognitive decline through spontaneous speech, using three guided clinical tasks. This joint effort followed a holistic approach, encompassing both knowledge-based acoustic and text-based feature sets, as well as LLM-based macrolinguistic descriptors, pause-based acoustic biomarkers, and multiple neural representations (e.g., LongFormer, ECAPA-TDNN, and TRILLsson embeddings). Combining these feature sets with different classifiers resulted in a large pool of models, from which we selected those that provided the best balance between train, development, and individual class performance. Our results show that our best performing systems correspond to combinations of models that are complementary to each other, relying on acoustic and textual information from all three clinical tasks.
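
The selection-and-combination step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Candidate` fields, the `balance_score` criterion, the model names, the scores, and the use of majority voting as the combination rule are all hypothetical. The class labels follow the AD/MCI/HC naming used in the summary above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    """One model from the pool (hypothetical structure, not the authors')."""
    name: str
    train_f1: float
    dev_f1: float
    dev_recall_per_class: dict  # class label -> recall on the development set


def balance_score(c, gap_weight=1.0):
    """Illustrative criterion: reward development F1 and worst-class recall,
    penalize the gap between training and development F1."""
    worst_class_recall = min(c.dev_recall_per_class.values())
    train_dev_gap = abs(c.train_f1 - c.dev_f1)
    return c.dev_f1 + worst_class_recall - gap_weight * train_dev_gap


def select_models(pool, k=2):
    """Keep the k candidates with the highest balance score."""
    return sorted(pool, key=balance_score, reverse=True)[:k]


def majority_vote(predictions):
    """Fuse per-model label sequences by majority vote, sample by sample.
    On a tie, the label that appears first among the votes wins."""
    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        fused.append(max(counts, key=counts.get))
    return fused


# Made-up numbers, for illustration only.
pool = [
    Candidate("acoustic_svm", 0.70, 0.62, {"HC": 0.70, "MCI": 0.50, "AD": 0.60}),
    Candidate("text_xgb", 0.99, 0.64, {"HC": 0.80, "MCI": 0.30, "AD": 0.70}),
    Candidate("embed_svm", 0.75, 0.60, {"HC": 0.65, "MCI": 0.55, "AD": 0.60}),
]
selected = select_models(pool, k=2)
print([c.name for c in selected])
```

In this toy pool, `text_xgb` has the highest development F1 but is ranked last, because its large train/development gap and weak MCI recall lower its balance score. That is the kind of trade-off the abstract alludes to, although the actual criterion used by the authors is not specified here.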
Problem

Research questions and friction points this paper is trying to address.

Brain Degeneration Assessment
Speech Analysis
Text Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Information Fusion
Advanced Computational Techniques
Brain Health Assessment
Authors
Catarina Botelho
Researcher at INESC-ID, Instituto Superior Técnico, University of Lisbon, Portugal
Machine learning, Speech processing, Medical diagnosis
David Gimeno-GĂłmez
PRHLT research center, Universitat PolitĂšcnica de ValĂšncia, Spain
Francisco Teixeira
INESC-ID / IST, University of Lisbon
Speech Processing, Privacy, Machine Learning
John Mendonça
INESC-ID
Conversational AI, Dialogue Systems, LLMs
PatrĂ­cia Pereira
INESC-ID, Portugal; Instituto Superior Técnico, University of Lisbon, Portugal
Diogo A.P. Nunes
Researcher @ INESC-ID | PhD Candidate @ Instituto Superior Técnico, Universidade de Lisboa
natural language processing, healthcare, generative ai
Thomas Rolland
INESC-ID
ASR, PEFT, Speech Processing, TTS, Machine learning
A. Pompili
INESC-ID, Portugal; Instituto Superior Técnico, University of Lisbon, Portugal
Rubén Solera-Ureña
INESC-ID, Portugal; Instituto Superior Técnico, University of Lisbon, Portugal
Maria Ponte
INESC-ID, Portugal; Instituto Superior Técnico, University of Lisbon, Portugal
David Martins de Matos
Associate Professor of Computer Science and Engineering, INESC-ID, Instituto Superior Técnico
Information Extraction, Music Information Retrieval, Natural Language, Programming Languages and Methodology, AI and Robotics
Carlos-D. MartĂ­nez-Hinarejos
Senior Lecturer (Titular de Universidad), PRHLT research center, Universitat PolitĂšcnica de ValĂšncia
Speech Recognition, Handwritten Text Recognition, Multimodality, Dialogue Systems, Pattern Recognition
Isabel Trancoso
INESC-ID / IST / University of Lisbon
Spoken Language Processing
Alberto Abad
INESC-ID, Portugal; Instituto Superior Técnico, University of Lisbon, Portugal