A regularized multi-state model for covariate selection with interval-censored survival data

📅 2025-08-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the uncertainty in disease onset time under interval censoring, and the diagnoses missed because of the semi-competing risk of death, this paper proposes a regularized illness-death multi-state model for high-dimensional covariates. It introduces transition-specific elastic-net penalties and a proximal gradient hybrid algorithm that jointly estimates the regression parameters of the three transitions and performs variable selection. Transition intensities are assumed proportional, and the penalty parameters are chosen by an outer grid search. The method is implemented in the R package HIDeM. In simulation studies, it outperformed a cause-specific competing risk model that neglects interval censoring, both in variable selection and in predicting illness probability. Applied to the Three-City cohort, it identified predictors of clinical dementia onset among brain imaging, cognitive, and clinical markers. The framework targets high-dimensional longitudinal cohort studies with semi-competing risks and interval-censored outcomes.

📝 Abstract
In population-based cohorts, disease diagnoses are typically interval-censored because they are made at scheduled follow-up visits. The exact disease onset time is thus unknown, and in the presence of a semi-competing risk of death, subjects may also die between two visits before any diagnosis can be made. Illness-death models can be used to handle uncertainty about illness timing and the possible absence of diagnosis due to death. However, so far they can handle only a limited number of covariates. We developed a regularized estimation procedure for illness-death models with interval-censored illness diagnosis that performs variable selection in the case of high-dimensional predictors. We considered a proximal gradient hybrid algorithm maximizing the regularized likelihood with an elastic-net penalty. The algorithm simultaneously estimates the regression parameters of the three transitions under proportional transition intensities, with transition-specific penalty parameters determined in an outer grid search. The algorithm, implemented in the R package HIDeM, performs well in predicting illness probability and in correctly selecting transition-specific risk factors across different simulation scenarios. In comparison, the cause-specific competing risk model neglecting interval censoring systematically showed worse predictive ability and tended to select irrelevant illness predictors that were originally associated with death. Applied to the population-based Three-City cohort, the method identified predictors of clinical dementia onset among a large set of brain imaging, cognitive and clinical markers.
Keywords: Interval censoring; Multi-state model; Semi-competing risk; Survival analysis; Variable selection.
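The elastic-net proximal step named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the HIDeM implementation: the function names, the toy quadratic loss standing in for the negative log-likelihood, and the penalty parameterization (one weight `lam` per transition plus a shared mixing weight `alpha`) are assumptions made here, and the hybrid part of the authors' algorithm is not reproduced.

```python
from typing import Callable, Dict, List

Coefs = Dict[str, List[float]]
TRANSITIONS = ("01", "02", "12")  # healthy->ill, healthy->dead, ill->dead


def soft_threshold(x: float, t: float) -> float:
    """Proximal operator of t * |x| (the lasso part of the penalty)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def prox_elastic_net(beta: List[float], step: float, lam: float, alpha: float) -> List[float]:
    """Prox of step * lam * (alpha * |b| + (1 - alpha) / 2 * b**2), coordinate-wise."""
    shrink = 1.0 + step * lam * (1.0 - alpha)
    return [soft_threshold(b, step * lam * alpha) / shrink for b in beta]


def proximal_gradient(grad: Callable[[Coefs], Coefs], beta0: Coefs, lam: Dict[str, float],
                      alpha: float = 0.5, step: float = 0.5, n_iter: int = 200) -> Coefs:
    """Gradient step on the smooth loss, then a prox step with a penalty per transition."""
    beta = {k: list(beta0[k]) for k in TRANSITIONS}
    for _ in range(n_iter):
        g = grad(beta)
        for k in TRANSITIONS:
            moved = [b - step * gk for b, gk in zip(beta[k], g[k])]
            beta[k] = prox_elastic_net(moved, step, lam[k], alpha)
    return beta


# Hypothetical smooth loss: 0.5 * ||beta - TARGET||^2 for each transition.
TARGET = [2.0, 0.05, -1.0]


def toy_grad(beta: Coefs) -> Coefs:
    return {k: [b - t for b, t in zip(beta[k], TARGET)] for k in TRANSITIONS}


start = {k: [0.0, 0.0, 0.0] for k in TRANSITIONS}
penalties = {"01": 0.5, "02": 0.0, "12": 5.0}  # assumed values, one per transition
fit = proximal_gradient(toy_grad, start, penalties)
for k in TRANSITIONS:
    print(k, [round(b, 3) for b in fit[k]])
```

With these assumed penalties, the unpenalized transition keeps its coefficients, the moderately penalized one loses its weakest coefficient, and the heavily penalized one is shrunk to all zeros. This is the sense in which transition-specific penalties allow a different set of selected covariates for each transition.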
Problem

Research questions and friction points this paper is trying to address.

Handles interval-censored disease onset with semi-competing death risks
Selects relevant predictors in high-dimensional illness-death models
Addresses limitations of traditional models ignoring interval-censoring and death
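The observation scheme behind these points can be illustrated with a small simulation. The Python sketch below is an assumption-laden toy, not the paper's simulation design: it uses constant transition intensities, a fixed visit schedule, and hypothetical names (`Observation`, `simulate_subject`) and rates chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    last_seen_healthy: float          # last visit with a negative diagnosis
    diagnosis_visit: Optional[float]  # first visit with a positive diagnosis, if any
    death_time: Optional[float]       # exact death time if it occurs before `end`
    truly_ill: bool                   # latent truth, unknown to the analyst


def simulate_subject(rng: random.Random, rate_01: float = 0.05, rate_02: float = 0.03,
                     rate_12: float = 0.10, visit_gap: float = 2.0,
                     end: float = 10.0) -> Observation:
    # Latent trajectory with constant intensities (an assumption of this sketch).
    t_ill = rng.expovariate(rate_01)
    t_die_healthy = rng.expovariate(rate_02)
    if t_ill < t_die_healthy:
        onset: Optional[float] = t_ill
        death = t_ill + rng.expovariate(rate_12)
    else:
        onset = None
        death = t_die_healthy

    # Illness status is only observed at scheduled visits attended while alive.
    last_healthy, diagnosis = 0.0, None
    k = 0
    while k * visit_gap <= end and k * visit_gap < death:
        visit = k * visit_gap
        if onset is not None and onset <= visit:
            diagnosis = visit  # onset lies somewhere in (last_healthy, visit]
            break
        last_healthy = visit
        k += 1

    return Observation(
        last_seen_healthy=last_healthy,
        diagnosis_visit=diagnosis,
        death_time=death if death <= end else None,
        truly_ill=onset is not None and onset <= end,
    )


rng = random.Random(0)
cohort = [simulate_subject(rng) for _ in range(2000)]
diagnosed = sum(o.diagnosis_visit is not None for o in cohort)
missed = sum(o.truly_ill and o.diagnosis_visit is None for o in cohort)
print(f"diagnosed: {diagnosed}, ill but never diagnosed: {missed}")
```

In this toy setup, every subject who became ill without being diagnosed died before the next scheduled visit. A model that treats such subjects as having died illness-free misattributes their risk factors, which is the situation the illness-death likelihood is meant to handle.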
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularized estimation for illness-death models
Proximal gradient hybrid algorithm with elastic-net
Simultaneous estimation of three transition parameters
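To make the three-transition structure concrete, the Python sketch below evaluates proportional transition intensities, where each transition multiplies its own baseline by the exponential of a transition-specific linear predictor. The coefficient values, baseline values, and function name are hypothetical, and the baseline is evaluated at a single time point for simplicity.

```python
import math
from typing import List

TRANSITIONS = ("01", "02", "12")  # healthy->ill, healthy->dead, ill->dead


def transition_intensity(baseline: float, beta: List[float], z: List[float]) -> float:
    """Proportional intensity: baseline * exp(beta . z) at one time point."""
    return baseline * math.exp(sum(b * x for b, x in zip(beta, z)))


# Hypothetical sparse estimates: covariate 0 matters for illness only,
# covariate 2 matters for death only, covariate 1 is never selected.
beta = {"01": [0.7, 0.0, 0.0], "02": [0.0, 0.0, 0.4], "12": [0.0, 0.0, 0.4]}
baseline = {"01": 0.05, "02": 0.03, "12": 0.10}  # assumed baseline values

# Covariates with a nonzero coefficient are the ones selected for that transition.
selected = {k: [j for j, b in enumerate(beta[k]) if b != 0.0] for k in TRANSITIONS}

z = [1.0, 1.0, 1.0]  # covariate vector of one hypothetical subject
intensities = {k: transition_intensity(baseline[k], beta[k], z) for k in TRANSITIONS}
for k in TRANSITIONS:
    print(k, "selected:", selected[k], "intensity:", round(intensities[k], 4))
```

Keeping one coefficient vector per transition is what lets a covariate be retained as a predictor of death while being dropped as a predictor of illness.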
Ariane Bercu
Univ. Bordeaux, INSERM, BPH, U1219, F-33000 Bordeaux, France
Agathe Guilloux
INRIA HeKA
statistics, statistical learning, biostatistics, survival analysis
Cécile Proust-Lima
Univ. Bordeaux, INSERM, BPH, U1219, F-33000 Bordeaux, France
Hélène Jacqmin-Gadda
Univ. Bordeaux, INSERM, BPH, U1219, F-33000 Bordeaux, France