🤖 AI Summary
Existing methods for modeling long-term progression of neurodegenerative diseases face two key limitations: (1) oversimplified brain connectivity representations—relying solely on unimodal connectomes—lead to inaccurate prediction of pathological spread; and (2) purely data-driven models lack biological constraints, resulting in poor identifiability and limited capacity to decipher dynamic interactions among biomarkers. To address these, we propose LLM-GNN, a novel framework that, for the first time, integrates large language models (LLMs) as sources of prior biological knowledge to guide multimodal, biologically plausible dynamic graph-structure learning. Our method jointly optimizes personalized disease trajectories and region-level interaction graphs by incorporating longitudinal tau-PET data. Evaluated on an Alzheimer’s disease cohort, LLM-GNN significantly improves pathological spread prediction accuracy, uncovers novel disease pathways beyond canonical structural connectivity, and enhances model interpretability and mechanistic traceability.
📝 Abstract
Understanding the interactions among biomarkers across brain regions during neurodegenerative disease is essential for unravelling the mechanisms underlying disease progression. For example, pathophysiological models of Alzheimer's Disease (AD) typically describe how variables, such as regional levels of toxic proteins, interact spatiotemporally within a dynamical system driven by an underlying biological substrate, often based on brain connectivity. However, current methods grossly oversimplify the complex relationships among brain connectivity modalities by assuming a single-modality brain connectome as the disease-spreading substrate. This leads to inaccurate predictions of pathology spread, especially over long-term progression. Meanwhile, methods that learn such a graph in a purely data-driven way face identifiability issues due to a lack of proper constraints. We thus present a novel framework that uses Large Language Models (LLMs) as expert guides on the interaction of regional variables to enhance learning of disease progression from irregularly sampled longitudinal patient data. By leveraging LLMs' ability to synthesize multi-modal relationships and incorporate diverse disease-driving mechanisms, our method simultaneously optimizes 1) the construction of long-term disease trajectories from individual-level observations and 2) the biologically constrained graph structure that captures interactions among brain regions with better identifiability. We demonstrate the new approach by estimating pathology propagation using tau-PET imaging data from an AD cohort. The new framework demonstrates superior prediction accuracy and interpretability compared to traditional approaches while revealing additional disease-driving factors beyond conventional connectivity measures.