🤖 AI Summary
Diagnosing neurodegenerative diseases (e.g., Alzheimer’s disease) under few-shot or zero-shot settings remains challenging due to limited annotated data and poor model interpretability. Method: We propose the first retrieval-augmented, evidence-guided multimodal reasoning framework centered on brain MRI, integrated with clinical text evidence. It employs contrastive learning to align visual–textual representations, introduces pseudo-text modalities (e.g., abnormality types, diagnostic labels, clinical descriptions), incorporates a retrieval-enhanced evidence encoding module, and adopts an attention-driven reasoning head to emulate clinician-like decision-making. Contribution/Results: The method requires minimal labeled data yet generates both accurate diagnostic predictions and clinically aligned, interpretable reports—complete with supporting reference images and pathologically grounded reasoning logic. Experiments demonstrate significant performance gains over state-of-the-art methods under ultra-low-label regimes, achieving high accuracy and strong clinical credibility. This work establishes a robust, interpretable paradigm for real-world neuroimaging-assisted diagnosis.
📝 Abstract
Timely and accurate diagnosis of neurodegenerative disorders, such as Alzheimer's disease, is central to disease management. Existing deep learning models require large-scale annotated datasets and often function as "black boxes". In clinical practice, however, datasets are frequently small or unlabeled, which prevents deep learning methods from reaching their full potential. Here, we introduce REMEMBER (Retrieval-based Explainable Multimodal Evidence-guided Modeling for Brain Evaluation and Reasoning), a new machine learning framework that facilitates zero- and few-shot Alzheimer's diagnosis from brain MRI scans through a reference-based reasoning process. Specifically, REMEMBER first trains a contrastively aligned vision-text model on expert-annotated reference data and extends it with pseudo-text modalities that encode abnormality types, diagnosis labels, and composite clinical descriptions. Then, at inference time, REMEMBER retrieves similar, human-validated cases from a curated dataset and integrates their contextual information through a dedicated evidence encoding module and an attention-based inference head. This evidence-guided design enables REMEMBER to imitate the real-world clinical decision-making process by grounding predictions in retrieved imaging and textual context. REMEMBER outputs diagnostic predictions alongside an interpretable report that includes reference images and explanations aligned with clinical workflows. Experimental results demonstrate that REMEMBER achieves robust zero- and few-shot performance and offers a powerful and explainable framework for neuroimaging-based diagnosis in the real world, especially when data are limited.
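To make the reference-based reasoning step more concrete, the following is a minimal sketch of retrieval followed by attention-style weighting over the retrieved cases. It is an illustration under stated assumptions, not the REMEMBER implementation: the function names (`cosine`, `retrieve_and_predict`), the `temperature` parameter, the toy three-dimensional embeddings, and the placeholder labels are all hypothetical, and a plain softmax over cosine similarities stands in for the learned evidence encoding module and inference head described above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_and_predict(query, references, k=2, temperature=0.1):
    """Retrieve the k most similar reference cases and vote with softmax weights.

    references: list of (embedding, label, note) tuples (hypothetical format).
    Returns (predicted label, per-label scores, evidence list).
    """
    # Rank reference cases by similarity to the query embedding, keep the top k.
    scored = sorted(
        ((cosine(query, emb), label, note) for emb, label, note in references),
        key=lambda item: item[0],
        reverse=True,
    )[:k]

    # Softmax over similarities: a simple stand-in for learned attention.
    top = max(sim for sim, _, _ in scored)
    exps = [math.exp((sim - top) / temperature) for sim, _, _ in scored]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Accumulate attention weight per label; the heaviest label is the prediction.
    scores = {}
    for weight, (_, label, _) in zip(weights, scored):
        scores[label] = scores.get(label, 0.0) + weight
    prediction = max(scores, key=scores.get)

    # The retrieved cases and their weights act as the supporting evidence.
    evidence = [(note, weight) for weight, (_, _, note) in zip(weights, scored)]
    return prediction, scores, evidence


# Toy data: embeddings and labels are placeholders, not real clinical cases.
references = [
    ([1.0, 0.0, 0.0], "class_A", "reference case 1"),
    ([0.8, 0.2, 0.0], "class_A", "reference case 2"),
    ([0.0, 1.0, 0.0], "class_B", "reference case 3"),
]
query = [0.9, 0.1, 0.0]
prediction, scores, evidence = retrieve_and_predict(query, references, k=2)
print(prediction, scores, evidence)
```

Returning the retrieved cases together with their weights mirrors the idea of reporting reference cases alongside the prediction. In a real system the embeddings would come from the contrastively aligned vision-text encoder rather than hand-written vectors.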