🤖 AI Summary
Traditional survival evaluation metrics (e.g., C-index, Brier score) rely on the independence censoring assumption and yield biased estimates when censoring is dependent on event times. This work addresses the failure of existing evaluation methods under dependent censoring by proposing, for the first time, three Copula-driven survival evaluation metrics. We further introduce the first semi-synthetic benchmark framework featuring controllable dependent censoring for data generation and model assessment. Methodologically, our approach integrates Copula modeling, nonparametric correction techniques, and Monte Carlo simulation to relax the independence censoring constraint, enabling interpretable and reproducible robust evaluation. Empirical results demonstrate that the proposed metrics yield more accurate estimates of prediction error—achieving an average 18.7% improvement in accuracy—and exhibit substantially enhanced robustness under strong dependent censoring scenarios.
📝 Abstract
Conventional survival metrics, such as Harrell's concordance index and the Brier Score, rely on the independent censoring assumption for valid inference in the presence of right-censored data. However, when instances are censored for reasons related to the event of interest, this assumption no longer holds, as this kind of dependent censoring biases the marginal survival estimates of popular nonparametric estimators. In this paper, we propose three copula-based metrics to evaluate survival models in the presence of dependent censoring, and design a framework to create realistic, semi-synthetic datasets with dependent censoring to facilitate the evaluation of the metrics. Our empirical analyses in synthetic and semi-synthetic datasets show that our metrics can give error estimates that are closer to the true error, mainly in terms of predictive accuracy.