🤖 AI Summary
Early diagnosis of Alzheimer’s disease (AD) is hindered by confounding biases, such as age and imaging artifacts, in multimodal neuroimaging data, as well as by complex inter-modal dependencies. To address these issues, we propose a vision–language cross-modal causal intervention framework that jointly leverages structural MRI, functional MRI, and structured clinical text generated by large language models (LLMs). Our method implicitly applies causal intervention during feature learning to disentangle confounders from disease-relevant signals. This enhances both robustness and interpretability in classifying cognitively normal (CN), mild cognitive impairment (MCI), and AD subjects. Evaluated on multiple public benchmarks, our model achieves state-of-the-art performance in accuracy, F1-score, and other key metrics. Notably, it provides the first systematic empirical validation of causal intervention for multimodal neurodegenerative disease diagnosis, demonstrating both efficacy and generalizability across diverse datasets and clinical scenarios.
📝 Abstract
Mild Cognitive Impairment (MCI) is a prodromal stage of Alzheimer's Disease (AD), and early identification and intervention can effectively slow progression to dementia. However, diagnosing AD remains a significant challenge in neurology because of confounders, which arise mainly from selection bias in multimodal data and from the complex relationships between variables. To address these issues, we propose a novel visual-language causal intervention framework named Alzheimer's Disease Prediction with Cross-modal Causal Intervention (ADPC) for diagnostic assistance. ADPC employs a large language model (LLM) to summarize clinical data under strict templates, maintaining structured text outputs even with incomplete or unevenly distributed datasets. The model uses Magnetic Resonance Imaging (MRI) images, functional MRI (fMRI) images, and the LLM-generated text to classify participants into Cognitively Normal (CN), MCI, and AD categories. Because of confounders such as neuroimaging artifacts and age-related biomarkers, non-causal models are likely to capture spurious input-output correlations and produce less reliable results. Our framework implicitly eliminates confounders through causal intervention. Experimental results demonstrate the outstanding performance of our method in distinguishing CN/MCI/AD cases, achieving state-of-the-art (SOTA) results on most evaluation metrics. The study showcases the potential of integrating causal reasoning with multi-modal learning for neurological disease diagnosis.