AI Summary
Chronic disease progression exhibits substantial inter-individual heterogeneity, which conventional population-level, homogeneous modeling approaches fail to capture due to their inability to account for genetic and environmental drivers of divergent trajectories. To address this, we propose the first integrated framework jointly modeling genetic features and longitudinal clinical time series, moving beyond static subtype assumptions to simultaneously discover genetically informed subtypes and infer personalized disease progression trajectories. Methodologically, our approach employs a variational autoencoder (VAE) to learn low-dimensional, interpretable genetic representations, coupled with an RNN-driven state-space model to characterize multi-stage clinical dynamics. Evaluated on both real-world and synthetic datasets, the model achieves a 12.3% absolute improvement in disease stage prediction accuracy, while yielding subtypes with enhanced biological interpretability and clinical coherence. This work establishes a novel paradigm for precision prognosis and targeted intervention.
Abstract
Modeling disease progression through multiple stages is critical for clinical decision-making in chronic diseases such as cancer, diabetes, and chronic kidney disease. Existing approaches often model disease progression as a single, uniform trajectory at the population level. However, chronic diseases are highly heterogeneous and often follow multiple progression patterns that depend on a patient's genetics and on environmental effects arising from lifestyle. We propose a personalized disease progression model that jointly learns the heterogeneous progression patterns and groups of genetic profiles. In particular, we design an end-to-end pipeline that simultaneously infers patient characteristics from genetic markers using a variational autoencoder and models how those characteristics drive disease progression using an RNN-based state-space model fitted to clinical observations. The proposed model shows improvement on both real-world and synthetic clinical data.
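To make the two-part architecture concrete, the following is a minimal, untrained forward-pass sketch of the general idea: a VAE-style encoder maps genetic markers to a latent vector, and that vector conditions a recurrent transition over clinical observations to produce per-visit stage probabilities. It is not the authors' implementation. Every name, dimension, and weight here (`encode_genetics`, `progress`, 20 markers, 3 stages, random parameters) is an assumption chosen for illustration, and the training objective is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_MARKERS, LATENT, HIDDEN, N_OBS, N_STAGES, T = 20, 4, 8, 5, 3, 6


def encode_genetics(g, W_mu, W_logvar, rng):
    """VAE-style encoder: map genetic markers g to a sampled latent z."""
    mu = g @ W_mu
    logvar = g @ W_logvar
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterization trick
    # KL divergence from N(mu, sigma^2) to the standard normal prior.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return z, kl


def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def progress(z, x_seq, W_h, W_x, W_z, W_out):
    """Recurrent transition conditioned on z; returns stage probabilities per visit."""
    h = np.zeros(W_h.shape[0])
    stage_probs = []
    for x_t in x_seq:
        # The hidden state depends on the previous state, the current
        # clinical observation, and the patient's genetic latent z.
        h = np.tanh(W_h @ h + W_x @ x_t + W_z @ z)
        stage_probs.append(softmax(W_out @ h))
    return np.stack(stage_probs)


# Random (untrained) parameters and synthetic inputs for one patient.
W_mu = 0.1 * rng.standard_normal((N_MARKERS, LATENT))
W_logvar = 0.1 * rng.standard_normal((N_MARKERS, LATENT))
W_h = 0.1 * rng.standard_normal((HIDDEN, HIDDEN))
W_x = 0.1 * rng.standard_normal((HIDDEN, N_OBS))
W_z = 0.1 * rng.standard_normal((HIDDEN, LATENT))
W_out = 0.1 * rng.standard_normal((N_STAGES, HIDDEN))

genetics = rng.integers(0, 3, size=N_MARKERS).astype(float)  # e.g. allele counts
visits = rng.standard_normal((T, N_OBS))                     # clinical time series

z, kl = encode_genetics(genetics, W_mu, W_logvar, rng)
stage_probs = progress(z, visits, W_h, W_x, W_z, W_out)
print(stage_probs.shape)
```

In a trained model of this kind, the KL term would typically be combined with a likelihood term on the clinical observations to form the objective, and patients with similar `z` would be expected to share progression patterns. The sketch only shows how the genetic latent could enter the transition at every time step.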