🤖 AI Summary
In multimodal Alzheimer's disease (AD) analysis, genetic features are weakly expressive and are often suppressed by the dominant imaging modality. To address this, the paper proposes GenDMR, a genetics-imaging fusion framework in which the two modalities update each other bidirectionally. It introduces a dynamic role-switching mechanism between the dominant and auxiliary modality and a module that encodes the spatial organization of SNPs along the genome. These are combined with multi-instance attention, which quantifies the disease risk of individual SNPs and brain regions in an interpretable way, and with contrastive self-distillation, which lets each modality refine the other's features. On the ADNI dataset the model is reported to achieve state-of-the-art performance. Its attention visualization highlights 12 potential high-risk genes associated with AD, including the well-established APOE and several recently highlighted risk genes, which supports both its predictive accuracy and its biological interpretability.
📝 Abstract
Recent studies have shown that fusing imaging and genetic features is beneficial for the etiological analysis and predictive diagnosis of Alzheimer's disease (AD). However, current deep learning methods have two critical flaws. First, the selection and encoding of genetic information have not been sufficiently explored. Second, because AD imaging features carry far more classification value than genetic features, many multimodal fusion studies emphasize the strengths of imaging features and actively reduce the influence of the weaker features, which limits what the model learns about the unique value of genetic features. To address these issues, this study proposes the dynamic multimodal role-swapping network (GenDMR). In GenDMR, we develop a novel approach to encode the spatial organization of single nucleotide polymorphisms (SNPs), enhancing the representation of their genomic context. To adaptively quantify the disease risk of SNPs and brain regions, we propose a multi-instance attention module, which also improves model interpretability. Furthermore, we introduce a dominant modality selection module and a contrastive self-distillation module, and combine them into a dynamic teacher-student role exchange mechanism in which the dominant modality acts as teacher and the auxiliary modality as student, so that the two modalities are updated bidirectionally. GenDMR achieves state-of-the-art performance on the public ADNI dataset. Visualizing its attention over SNPs highlights 12 potential high-risk genes related to AD, including the classic APOE gene and other recently highlighted risk genes. This demonstrates GenDMR's capacity for interpretable analysis of AD genetic features and offers new insights for the development of multimodal data fusion techniques.
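The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: attention pooling over a bag of instances (for example, SNPs or brain regions) and choosing which modality acts as teacher. The function names, the tanh scoring form, the scoring vector `w`, and the lower-loss-wins rule are all assumptions made for illustration, not GenDMR's actual design.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    instances: Sequence[Sequence[float]],
    w: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Attention-based multi-instance pooling (illustrative form).

    instances: one feature vector per instance (e.g. per SNP or brain region).
    w: hypothetical scoring vector with the same length as a feature vector.
    Returns (bag_embedding, attention_weights).
    """
    # Score each instance with tanh(w . x), then normalise across the bag.
    scores = [math.tanh(sum(wi * xi for wi, xi in zip(w, x))) for x in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    # The bag embedding is the attention-weighted sum of instance vectors.
    bag = [sum(a * x[d] for a, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights


def pick_teacher(loss_genetic: float, loss_imaging: float) -> str:
    """Assumed rule: the modality with the lower loss acts as teacher."""
    return "genetic" if loss_genetic < loss_imaging else "imaging"


# Toy bag of three instances with two features each (made-up numbers).
snps = [[0.1, 0.0], [2.0, 1.0], [0.3, 0.2]]
bag, weights = attention_pool(snps, w=[1.0, 1.0])
teacher = pick_teacher(loss_genetic=0.42, loss_imaging=0.65)
```

In this sketch the attention weights sum to one, so they can be read as relative per-instance importance within the bag, which is the kind of quantity an attention visualization over SNPs would display. Because the teacher is re-selected from the current losses, the role can switch between modalities during training rather than being fixed to imaging.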