Disease Progression and Subtype Modeling for Combined Discrete and Continuous Input Data

📅 2026-02-25

🤖 AI Summary

This work addresses a limitation of existing disease progression models: most support only a single data type and therefore cope poorly with the mix of discrete and continuous biomarkers common in real-world clinical datasets. The authors propose the Mixed Events model, which accepts both data types, and implement it within the Subtype and Stage Inference (SuStaIn) framework to obtain Mixed-SuStaIn, which jointly models disease subtypes and progression. In simulation experiments and on real-world data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), Mixed-SuStaIn performs well on mixed datasets. The code is publicly available.

📝 Abstract

Disease progression modeling provides a robust framework to identify long-term disease trajectories from short-term biomarker data. It is a valuable tool to gain a deeper understanding of diseases with a long disease trajectory, such as Alzheimer's disease. A key limitation of most disease progression models is that they are specific to a single data type (e.g., continuous data), thereby limiting their applicability to heterogeneous, real-world datasets. To address this limitation, we propose the Mixed Events model, a novel disease progression model that handles both discrete and continuous data types. This model is implemented within the Subtype and Stage Inference (SuStaIn) framework, resulting in Mixed-SuStaIn, enabling subtype and progression modeling. We demonstrate the effectiveness of Mixed-SuStaIn through simulation experiments and real-world data from the Alzheimer's Disease Neuroimaging Initiative, showing that it performs well on mixed datasets. The code is available at: https://github.com/ucl-pond/pySuStaIn.
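
To make the idea of combining data types concrete, the following is a minimal sketch of how a generic event-based staging likelihood could score continuous and discrete biomarkers together. It is not the authors' Mixed Events model and does not use the pySuStaIn API. The biomarker names, the Gaussian and categorical parameters, and the fixed event ordering are all hypothetical values chosen for illustration. The sketch assumes each biomarker is either "normal" or "abnormal" at a given stage, with continuous markers scored by a Gaussian density and discrete markers by a probability table.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def biomarker_likelihoods(value, spec):
    """Return (p_normal, p_abnormal) for one observed biomarker value."""
    if spec["kind"] == "continuous":
        # Assumption: Gaussian (mean, sd) for each state.
        return (gaussian_pdf(value, *spec["normal"]),
                gaussian_pdf(value, *spec["abnormal"]))
    # Assumption: discrete markers use a probability table per state.
    return (spec["normal"][value], spec["abnormal"][value])


def stage_log_likelihoods(subject, specs, ordering):
    """log p(subject | stage k) for k = 0..len(ordering).

    At stage k the first k biomarkers in `ordering` are abnormal
    and the remaining ones are still normal.
    """
    pairs = [biomarker_likelihoods(subject[name], specs[name]) for name in ordering]
    result = []
    for k in range(len(ordering) + 1):
        total = 0.0
        for position, (p_normal, p_abnormal) in enumerate(pairs):
            # All probabilities here are strictly positive, so log is safe.
            total += math.log(p_abnormal if position < k else p_normal)
        result.append(total)
    return result


def most_likely_stage(subject, specs, ordering):
    """Index of the stage with the highest log-likelihood."""
    lls = stage_log_likelihoods(subject, specs, ordering)
    return max(range(len(lls)), key=lls.__getitem__)


# Hypothetical biomarkers: one continuous z-score, one ordinal visual rating.
specs = {
    "hippocampal_volume": {"kind": "continuous",
                           "normal": (0.0, 1.0), "abnormal": (-2.0, 1.0)},
    "visual_rating": {"kind": "discrete",
                      "normal": {0: 0.8, 1: 0.15, 2: 0.05},
                      "abnormal": {0: 0.1, 1: 0.3, 2: 0.6}},
}
ordering = ["visual_rating", "hippocampal_volume"]  # assumed event sequence

healthy = {"hippocampal_volume": 0.1, "visual_rating": 0}
advanced = {"hippocampal_volume": -2.1, "visual_rating": 2}
print(most_likely_stage(healthy, specs, ordering))
print(most_likely_stage(advanced, specs, ordering))
```

Because both data types reduce to a pair of per-state probabilities, the staging loop does not need to know which kind of biomarker it is scoring. A subtype model in the spirit of SuStaIn would additionally fit several orderings and mix over them, which this sketch omits.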

Problem

Research questions and friction points this paper is trying to address.

disease progression modeling

mixed data types

discrete and continuous data

disease subtypes

heterogeneous data

Innovation

Methods, ideas, or system contributions that make the work stand out.

disease progression modeling

mixed data types

discrete and continuous data