🤖 AI Summary
In clinical practice, the time of disease onset is often known only to lie between two follow-up visits (interval censoring), yet analyses based on the illness-death three-state model frequently ignore this, which biases estimates of discrimination performance such as the time-dependent AUC. This study shows that ignoring interval censoring leads to substantial underestimation of the time-specific AUC, and that interval censoring must be accounted for both when fitting the model and when assessing discrimination. Using simulations and real-world data on high-grade soft tissue sarcoma, we compare four approaches: three illness-death models for interval-censored data (Weibull parametric hazards and M-spline-smoothed hazards from SmoothHazard, and piecewise-constant hazards from msm) and a naïve Cox model with a time-dependent disease marker that ignores interval censoring. Results show that ignoring interval censoring underestimates the dynamic AUC by over 12% on average; accounting for it markedly improves estimation accuracy. Our work provides a methodological benchmark for robust discrimination evaluation of illness-death prediction models under interval censoring.
📝 Abstract
In clinical studies, the illness-death model is often used to describe disease progression. A subject starts disease-free and may either develop the disease and then die, or die without ever developing it. In clinical practice, disease can only be diagnosed at pre-specified follow-up visits, so the exact time of disease onset is often unknown, resulting in interval-censored data. This study examines how ignoring the interval-censored nature of disease data affects the estimated discrimination performance of illness-death models, focusing on the time-specific Area Under the receiver operating characteristic Curve (AUC) in both its incident/dynamic and cumulative/dynamic definitions. We conduct a simulation study in which data are generated from Weibull transition hazards and the disease state is observed only at regular intervals. Estimates are obtained with two kinds of methods: the Cox model with a time-dependent binary disease marker, which ignores interval censoring, and the illness-death model for interval-censored data, fitted with three implementations: the piecewise-constant model from the msm package and the Weibull and M-spline models from the SmoothHazard package. These methods are also applied to a dataset of 2232 patients with high-grade soft tissue sarcoma, where the interval-censored disease state is the post-operative development of distant metastases. The results suggest that, in the presence of interval-censored disease times, it is important to account for interval censoring not only when estimating the parameters of the model but also when evaluating discrimination performance.
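The simulation design described above can be illustrated with a minimal sketch. This is not the authors' code: the Weibull shapes and scales, the yearly visit schedule, the 10-year administrative censoring time, and the names `weibull_time` and `simulate_subject` are all assumptions chosen for illustration, and the ill-to-dead transition is assumed to run on the same clock as the other two (clock-forward). Each subject's onset time is reduced to the interval between the last disease-free visit and the first visit at which disease is seen, while death is observed exactly.

```python
import math
import random


def weibull_time(rng, shape, scale, start=0.0):
    """Draw an event time from a Weibull hazard, given no event before `start`.

    Inverts the conditional survival exp(-(H(t) - H(start))), H(t) = (t/scale)**shape.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return scale * ((start / scale) ** shape - math.log(u)) ** (1.0 / shape)


def simulate_subject(rng, visit_gap=1.0, tau=10.0):
    """Simulate one subject; all hazard parameters below are hypothetical."""
    t01 = weibull_time(rng, 1.2, 8.0)    # healthy -> ill
    t02 = weibull_time(rng, 1.5, 12.0)   # healthy -> dead
    if t01 < t02:
        onset = t01
        death = weibull_time(rng, 1.5, 5.0, start=onset)  # ill -> dead
    else:
        onset, death = None, t02

    exit_time = min(death, tau)          # death or administrative censoring
    died = death <= tau
    if onset is not None and onset >= exit_time:
        onset = None                     # disease would start after follow-up ends

    # Visits at visit_gap, 2 * visit_gap, ... strictly before the exit time.
    visits = []
    k = 1
    while k * visit_gap < exit_time:
        visits.append(k * visit_gap)
        k += 1

    # Last visit at which the subject was seen disease-free (0.0 = baseline).
    left = max((v for v in visits if onset is None or v < onset), default=0.0)
    # First visit at which disease was seen; None if it was never observed.
    right = min((v for v in visits if onset is not None and v >= onset), default=None)

    return {"left": left, "right": right, "exit_time": exit_time,
            "died": died, "true_onset": onset}


rng = random.Random(2024)
cohort = [simulate_subject(rng) for _ in range(2000)]
n_ill = sum(s["true_onset"] is not None for s in cohort)
n_seen = sum(s["right"] is not None for s in cohort)
print(f"became ill during follow-up: {n_ill}, diagnosed at a visit: {n_seen}")
```

In this sketch, subjects who become ill and then die or are censored before their next visit have `right` set to `None`: their disease is never observed. A naïve analysis would treat them as disease-free and would date every observed onset at the diagnosing visit rather than somewhere in `(left, right]`, which is the kind of misclassification the study investigates.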