🤖 AI Summary
This work addresses a limitation of most existing disease progression models: they are specific to a single data type and therefore struggle with the heterogeneous mix of discrete and continuous biomarkers found in real-world clinical datasets. To overcome this, the authors propose the Mixed Events model and implement it within the Subtype and Stage Inference (SuStaIn) framework, yielding Mixed-SuStaIn, which handles both data types in a single model. The method supports both subtype identification and progression modeling. In simulation experiments and on real-world data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), Mixed-SuStaIn performs well on mixed datasets. The code is publicly available in the pySuStaIn repository.
📝 Abstract
Disease progression modeling provides a robust framework to identify long-term disease trajectories from short-term biomarker data. It is a valuable tool to gain a deeper understanding of diseases with a long disease trajectory, such as Alzheimer's disease. A key limitation of most disease progression models is that they are specific to a single data type (e.g., continuous data), thereby limiting their applicability to heterogeneous, real-world datasets. To address this limitation, we propose the Mixed Events model, a novel disease progression model that handles both discrete and continuous data types. This model is implemented within the Subtype and Stage Inference (SuStaIn) framework, resulting in Mixed-SuStaIn, enabling subtype and progression modeling. We demonstrate the effectiveness of Mixed-SuStaIn through simulation experiments and real-world data from the Alzheimer's Disease Neuroimaging Initiative, showing that it performs well on mixed datasets. The code is available at: https://github.com/ucl-pond/pySuStaIn.
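To make the idea of combining discrete and continuous data concrete, the following is a minimal illustrative sketch of an event-based likelihood in which each biomarker contributes either a Gaussian density (continuous) or a categorical probability (discrete). It is not the authors' implementation and does not use the pySuStaIn API; the biomarker names (`volume_z`, `rating`), the distribution parameters, and all function names are hypothetical assumptions chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def biomarker_likelihood(value, spec, abnormal):
    """Likelihood of one measurement given whether its event has occurred."""
    state = "abnormal" if abnormal else "normal"
    if spec["kind"] == "continuous":
        mean, sd = spec[state]
        return gaussian_pdf(value, mean, sd)
    # Discrete biomarker: look up the probability of the observed category.
    return spec[state][value]

def stage_likelihoods(subject, specs, sequence):
    """Likelihood of a subject's data at each stage 0..N of an event ordering.

    At stage k, the first k biomarkers in `sequence` are treated as abnormal
    and the remaining ones as normal.
    """
    result = []
    for k in range(len(sequence) + 1):
        occurred = set(sequence[:k])
        lik = 1.0
        for name, value in subject.items():
            lik *= biomarker_likelihood(value, specs[name], name in occurred)
        result.append(lik)
    return result

def most_likely_stage(subject, specs, sequence):
    """Index of the stage with the highest likelihood."""
    liks = stage_likelihoods(subject, specs, sequence)
    return max(range(len(liks)), key=liks.__getitem__)

# Hypothetical biomarker specifications (assumed values, not from the paper).
specs = {
    "volume_z": {"kind": "continuous", "normal": (0.0, 1.0), "abnormal": (-2.0, 1.0)},
    "rating": {"kind": "discrete",
               "normal": {0: 0.9, 1: 0.1},
               "abnormal": {0: 0.2, 1: 0.8}},
}
sequence = ["rating", "volume_z"]  # assumed event ordering

early = most_likely_stage({"volume_z": 0.0, "rating": 0}, specs, sequence)
middle = most_likely_stage({"volume_z": 0.0, "rating": 1}, specs, sequence)
late = most_likely_stage({"volume_z": -2.0, "rating": 1}, specs, sequence)
```

In this sketch the event ordering is fixed and only the stage is inferred per subject; a subtype-aware model in the spirit of SuStaIn would additionally fit several orderings and assign subjects to them, which is beyond this illustration.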