AI Summary
In survival analysis, existing models lack reliable, model-agnostic uncertainty quantification, which severely hinders their trustworthy deployment in high-stakes clinical decision-making. This paper proposes SurvUnc, a post-hoc meta-modeling framework that enables uncertainty estimation without modifying the underlying survival model. Its key contributions are: (1) the first anchor-based learning strategy that incorporates C-index concordance knowledge into meta-model optimization; and (2) the first comprehensive evaluation pipeline tailored to survival uncertainty, covering selective prediction, misprediction detection, and out-of-domain detection. Extensive experiments across four public datasets and five state-of-the-art survival models demonstrate consistent superiority over baselines, with selective prediction AUC improving by up to 12.3%, significantly enhancing predictive reliability and robustness.
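The concordance knowledge referenced above builds on Harrell's C-index, the standard pairwise ranking metric for censored survival data. The paper's anchor-based strategy is not publicly specified in this summary, so the sketch below shows only the underlying metric it draws on: a pair is comparable when the subject with the earlier time experienced an observed event, and concordant when that subject also received the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       : observed time for each subject (event or censoring time)
    events      : 1 if the event was observed, 0 if censored
    risk_scores : model-predicted risk (higher = event expected sooner)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable only if i had an observed event
            # strictly before j's time; otherwise ordering is unknown.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1          # correctly ranked pair
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5        # ties count half, by convention
    return concordant / comparable

# A model that ranks all subjects correctly scores 1.0; random ranking ~0.5.
```

This O(n^2) loop is illustrative; production implementations (e.g. in lifelines or scikit-survival) use faster sorted-order algorithms.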
Abstract
Survival analysis, which estimates the probability of event occurrence over time from censored data, is fundamental to numerous real-world applications, particularly in high-stakes domains such as healthcare and risk assessment. Despite the development of numerous survival models, quantifying the uncertainty of their predictions remains underexplored and challenging. The lack of reliable uncertainty quantification limits the interpretability and trustworthiness of survival models, hindering their adoption in clinical decision-making and other sensitive applications. To bridge this gap, we introduce SurvUnc, a novel meta-model-based framework for post-hoc uncertainty quantification of survival models. SurvUnc introduces an anchor-based learning strategy that integrates concordance knowledge into meta-model optimization, leveraging pairwise ranking performance to estimate uncertainty effectively. Notably, our framework is model-agnostic, ensuring compatibility with any survival model without requiring modifications to its architecture or access to its internal parameters. Moreover, we design a comprehensive evaluation pipeline tailored to this critical yet overlooked problem. Through extensive experiments on four publicly available benchmark datasets and five representative survival models, we demonstrate the superiority of SurvUnc across multiple evaluation scenarios, including selective prediction, misprediction detection, and out-of-domain detection. Our results highlight the effectiveness of SurvUnc in enhancing model interpretability and reliability, paving the way for more trustworthy survival predictions in real-world applications.
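Of the evaluation scenarios named above, selective prediction is the most mechanical: predictions are sorted by estimated uncertainty, the most uncertain fraction is withheld, and performance is measured on the retained subset; sweeping the retained fraction traces a curve whose area summarizes how well the uncertainty scores rank errors. The sketch below is a generic version of this protocol (the function names and the use of per-sample correctness as the quality measure are illustrative assumptions, not the paper's exact pipeline).

```python
def selective_curve(uncertainty, correctness, fractions):
    """Performance on the retained subset at each coverage level.

    uncertainty : per-sample uncertainty score (higher = less trusted)
    correctness : per-sample quality signal, e.g. 1 if prediction correct
    fractions   : coverage levels to evaluate, e.g. [0.1, 0.2, ..., 1.0]
    """
    # Keep the lowest-uncertainty samples first.
    order = sorted(range(len(uncertainty)), key=lambda i: uncertainty[i])
    curve = []
    for f in fractions:
        k = max(1, int(round(f * len(order))))
        kept = order[:k]
        curve.append(sum(correctness[i] for i in kept) / k)
    return curve

def curve_auc(fractions, curve):
    """Trapezoidal area under the coverage-performance curve."""
    area = 0.0
    for a, b, ya, yb in zip(fractions, fractions[1:], curve, curve[1:]):
        area += (b - a) * (ya + yb) / 2.0
    return area
```

A well-calibrated uncertainty estimator concentrates mistakes in the withheld tail, so performance rises as coverage shrinks and the area under the curve grows; this is the quantity the reported 12.3% improvement refers to.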