🤖 AI Summary
Early detection of Alzheimer’s disease (AD) is hindered by the challenge of jointly modeling macroscopic neuroanatomical alterations and microscopic genetic susceptibility signals. To address this, we propose the first interpretable multimodal large model that synergistically integrates structural MRI and single-nucleotide polymorphism (SNP) data. Our method introduces a novel region-of-interest (ROI)-guided visual tokenization scheme coupled with genetic prompting, enabling stage-specific neuro-genetic joint representation learning. It employs ROI-wise Vision Transformers, genetic text encoding, and cross-modal attention to achieve accurate four-stage classification: cognitively normal, subjective memory complaint (SMC), mild cognitive impairment (MCI), and AD. Attention-based attribution further uncovers biologically meaningful ROI–gene associations. Evaluated on the ADNI cohort, our model achieves state-of-the-art performance, robustly recapitulates established GWAS risk genes (e.g., APOE, BIN1), and identifies stage-dependent pathological patterns: striatal involvement in SMC, frontotemporal atrophy in MCI, and whole-brain network disruption in AD.
📝 Abstract
Early detection of Alzheimer's disease (AD) requires models capable of integrating macro-scale neuroanatomical alterations with micro-scale genetic susceptibility, yet existing multimodal approaches struggle to align these heterogeneous signals. We introduce R-GenIMA, an interpretable multimodal large language model that couples a novel ROI-wise vision transformer with genetic prompting to jointly model structural MRI and single nucleotide polymorphism (SNP) variations. By representing each anatomically parcellated brain region as a visual token and encoding SNP profiles as structured text, the framework enables cross-modal attention that links regional atrophy patterns to underlying genetic factors. Applied to the ADNI cohort, R-GenIMA achieves state-of-the-art performance in four-way classification across normal cognition (NC), subjective memory concerns (SMC), mild cognitive impairment (MCI), and AD. Beyond predictive accuracy, the model yields biologically meaningful explanations by identifying stage-specific brain regions and gene signatures, as well as coherent ROI–gene association patterns across the disease continuum. Attention-based attribution reveals genes consistently enriched for established GWAS-supported AD risk loci, including APOE, BIN1, CLU, and RBFOX1. Stage-resolved neuroanatomical signatures show shared vulnerability hubs across disease stages alongside stage-specific patterns: striatal involvement in subjective decline, frontotemporal engagement during prodromal impairment, and consolidated multimodal network disruption in AD. These results demonstrate that interpretable multimodal AI can synthesize imaging and genetics to reveal mechanistic insights, providing a foundation for clinically deployable tools that enable earlier risk stratification and inform precision therapeutic strategies in Alzheimer's disease.
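The core mechanism the abstract describes, brain-region (ROI) visual tokens attending over gene-prompt tokens so that attention weights double as ROI–gene attributions, can be sketched as follows. This is a minimal PyTorch illustration under assumed dimensions (90 ROIs, 30 gene tokens, 64-dim embeddings); the module and tensor names are hypothetical and do not reproduce R-GenIMA's actual implementation.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Toy cross-modal block: ROI visual tokens query gene text tokens.

    All sizes and names are illustrative assumptions, not the paper's
    architecture.
    """
    def __init__(self, dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, roi_tokens, gene_tokens):
        # Queries come from ROI embeddings; keys/values from SNP/gene
        # prompt embeddings. The returned attention map (ROI x gene)
        # is what attribution analyses can read off directly.
        fused, weights = self.attn(roi_tokens, gene_tokens, gene_tokens)
        return self.norm(roi_tokens + fused), weights

# One subject: 90 ROI tokens (AAL-style parcellation) and 30 gene tokens.
roi = torch.randn(1, 90, 64)
genes = torch.randn(1, 30, 64)
fused, attn = CrossModalAttention()(roi, genes)
print(fused.shape)  # torch.Size([1, 90, 64])
print(attn.shape)   # torch.Size([1, 90, 30])  -> ROI-gene association map
```

Averaging the attention map over subjects within a diagnostic stage would give the kind of stage-specific ROI–gene association patterns the abstract reports.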