🤖 AI Summary
This study addresses the challenge of remote, contactless psychological and cognitive assessment in older adults. We propose a multimodal analytical framework integrating facial expressions, speech acoustics, linguistic features, and cardiovascular signals derived from remote photoplethysmography (rPPG). For the first time, we jointly model rPPG-based heart rate variability (HRV) features with facial action units, acoustic prosody, and language representations, revealing modality-specific contributions: speech features predominantly drive cognitive impairment detection, while facial and rPPG features jointly dominate discrimination of psychological traits, including social isolation, neuroticism, and negative affect. Evaluated on real-world remote video dialogue data, our model achieves AUCs of 0.78 for detecting CDR 0.5-level mild cognitive impairment, 0.75 for social isolation, 0.71 for neuroticism, and 0.79 for negative affect. These results demonstrate the feasibility and clinical potential of integrated, multimodal psychological and cognitive assessment in remote settings.
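One way to read the joint-modeling claim is as a late-fusion design: each modality yields a feature block, the blocks are concatenated per subject, and a single classifier is trained on the fused vector. The sketch below is a minimal illustration under that assumption; the feature extractors, dimensions, and the logistic-regression classifier are placeholders, not the study's actual pipeline.

```python
# Hedged sketch: late fusion of multimodal features for binary screening.
# All feature blocks and dimensions below are hypothetical placeholders;
# the paper does not specify this exact pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_subjects = 39  # cohort size reported in the abstract

# Stand-ins for per-modality features (real ones would come from, e.g.,
# facial action units, prosodic statistics, text embeddings, rPPG-derived HRV).
facial = rng.normal(size=(n_subjects, 17))    # e.g., action-unit intensities
acoustic = rng.normal(size=(n_subjects, 24))  # e.g., prosodic summary stats
language = rng.normal(size=(n_subjects, 64))  # e.g., pooled text embeddings
hrv = rng.normal(size=(n_subjects, 8))        # e.g., SDNN, RMSSD, LF/HF ratio

# Synthetic binary labels standing in for, e.g., CDR 0.5 vs. 0.
y = np.array([0] * 20 + [1] * 19)

# Late fusion: concatenate modality blocks into one feature vector per subject.
X = np.hstack([facial, acoustic, language, hrv])

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()
print(f"cross-validated AUC (random data, ~0.5 expected): {auc:.2f}")
```

With only 39 subjects, reporting cross-validated AUC rather than a single train/test split is the natural evaluation choice, which is what the sketch computes.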
📝 Abstract
The aging society urgently requires scalable methods to monitor cognitive decline and to identify social and psychological factors indicative of dementia risk in older adults. Recent advances in machine learning (ML) offer new opportunities to remotely detect cognitive impairment and to assess associated factors such as neuroticism and psychological well-being. Our ML models captured facial, acoustic, linguistic, and cardiovascular features derived from remote video conversations with 39 individuals with normal cognition or Mild Cognitive Impairment, and classified cognitive status, social isolation, neuroticism, and psychological well-being. Our model distinguished a Clinical Dementia Rating (CDR) score of 0.5 (vs. 0) with an area under the receiver operating characteristic curve (AUC) of 0.78, social isolation with an AUC of 0.75, neuroticism with an AUC of 0.71, and negative affect with an AUC of 0.79. Our experiments showed that speech and language patterns were more useful for quantifying cognitive impairment, whereas facial expression and cardiovascular patterns derived from remote photoplethysmography (rPPG) were more useful for quantifying personality and psychological well-being.
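The modality-specific finding reported here (speech and language for cognition, face and rPPG for psychological traits) is the kind of conclusion a per-modality ablation yields: train the same classifier on each modality block alone, per target, and compare cross-validated AUCs. A minimal sketch with synthetic placeholder data, not the study's features or labels:

```python
# Hedged sketch: per-modality ablation to see which modality drives which
# target. Features and labels are synthetic placeholders, so AUCs will sit
# near 0.5; with real features, higher AUCs indicate informative modalities.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 39  # cohort size reported in the abstract
modalities = {
    "facial": rng.normal(size=(n, 17)),
    "acoustic": rng.normal(size=(n, 24)),
    "language": rng.normal(size=(n, 64)),
    "rppg_hrv": rng.normal(size=(n, 8)),
}
# Binary targets analogous to the abstract's outcomes (synthetic here).
targets = {
    "cdr_0.5_vs_0": np.array([0] * 20 + [1] * 19),
    "social_isolation": np.array([0] * 19 + [1] * 20),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for target_name, y in targets.items():
    for mod_name, X in modalities.items():
        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()
        print(f"{target_name:>16} | {mod_name:<8} AUC={auc:.2f}")
```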