In-Training Multicalibrated Survival Analysis for Healthcare via Constrained Optimization

Date: 2025-07-03
AI Summary
Existing survival analysis models are typically calibrated only at the population level, which can leave minority subpopulations miscalibrated and raise the risk of clinical misjudgment. To address this, we propose GRADUATE, a framework that formulates calibration across multiple subpopulations as a constrained optimization problem and jointly optimizes predictive discrimination and cross-group calibration during training. We provide theoretical guarantees on the near-optimality and feasibility of its solution. Through a multicalibration loss, GRADUATE pushes the predicted probabilities within each subpopulation toward that subpopulation's true event rate while preserving predictive accuracy. Evaluated on multiple real-world clinical datasets, GRADUATE outperforms state-of-the-art methods and stays well calibrated across diverse subpopulations. The work targets fair and reliable individualized prognostic assessment.

Abstract
Survival analysis is an important problem in healthcare because it models the relationship between an individual's covariates and the onset time of an event of interest (e.g., death). It is important for survival models to be well-calibrated (i.e., for their predicted probabilities to be close to ground-truth probabilities) because badly calibrated systems can result in erroneous clinical decisions. Existing survival models are typically calibrated at the population level only, and thus run the risk of being poorly calibrated for one or more minority subpopulations. We propose a model called GRADUATE that achieves multicalibration by ensuring that all subpopulations are well-calibrated too. GRADUATE frames multicalibration as a constrained optimization problem, and optimizes both calibration and discrimination in-training to achieve a good balance between them. We mathematically prove that the optimization method used yields a solution that is both near-optimal and feasible with high probability. Empirical comparisons against state-of-the-art baselines on real-world clinical datasets demonstrate GRADUATE's efficacy. In a detailed analysis, we elucidate the shortcomings of the baselines vis-a-vis GRADUATE's strengths.
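The abstract describes multicalibration as a constrained optimization problem without giving the formula. As a rough sketch only, a generic problem of this shape can be written as below. Every symbol here is an assumption introduced for illustration and is not the paper's notation: \theta denotes the model parameters, \mathcal{L}_{\mathrm{disc}} a discrimination loss, \mathcal{G} the set of subpopulations, C_g a calibration error measured on subpopulation g, and \epsilon a tolerance.

```latex
% Generic constrained form (illustrative symbols, not the paper's notation)
\min_{\theta} \; \mathcal{L}_{\mathrm{disc}}(\theta)
\quad \text{subject to} \quad
C_g(\theta) \le \epsilon \quad \text{for all } g \in \mathcal{G}

% A standard Lagrangian relaxation of the same problem, with one multiplier per subpopulation
\mathcal{L}(\theta, \lambda)
  = \mathcal{L}_{\mathrm{disc}}(\theta)
  + \sum_{g \in \mathcal{G}} \lambda_g \bigl( C_g(\theta) - \epsilon \bigr),
\qquad \lambda_g \ge 0
```

In this reading, "feasible" means every constraint C_g(\theta) \le \epsilon holds, and "near-optimal" means the discrimination loss is close to the best value attainable among feasible solutions.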
Problem

Research questions and friction points this paper is trying to address.

Ensures survival models are well-calibrated for all subpopulations
Balances calibration and discrimination via constrained optimization
Addresses poor calibration in minority groups in healthcare
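The gap between population-level and subgroup-level calibration can be illustrated with a small numerical sketch. This is not the paper's metric: it assumes a single fixed time horizon, binary outcomes with no censoring (a strong simplification for survival data), and a simple binned calibration error. The function names `calibration_error` and `subgroup_calibration` and the synthetic data are hypothetical.

```python
import numpy as np

def calibration_error(pred, outcome, n_bins=10):
    """Binned calibration error: bin-weighted |mean prediction - observed event rate|."""
    pred = np.asarray(pred, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    # Assign each prediction to one of n_bins equal-width bins on [0, 1].
    bins = np.minimum((pred * n_bins).astype(int), n_bins - 1)
    error = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(pred[mask].mean() - outcome[mask].mean())
            error += mask.mean() * gap  # weight by the fraction of samples in the bin
    return error

def subgroup_calibration(pred, outcome, groups, n_bins=10):
    """Calibration error for the whole population and for each subgroup label."""
    pred, outcome, groups = map(np.asarray, (pred, outcome, groups))
    report = {"overall": calibration_error(pred, outcome, n_bins)}
    for g in np.unique(groups):
        mask = groups == g
        report[str(g)] = calibration_error(pred[mask], outcome[mask], n_bins)
    return report

# Synthetic example (assumed data): every patient gets a predicted event probability of 0.3.
# Majority: 80 patients, 20 events (rate 0.25). Minority: 20 patients, 10 events (rate 0.5).
pred = np.full(100, 0.3)
outcome = np.concatenate([np.ones(20), np.zeros(60), np.ones(10), np.zeros(10)])
groups = np.array(["majority"] * 80 + ["minority"] * 20)
report = subgroup_calibration(pred, outcome, groups)
print(report)
```

In this synthetic case the population event rate is 0.3, so the overall error is essentially zero, while the minority group's error is about 0.2. That is the failure mode the problem statement describes.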
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multicalibration via constrained optimization
Optimizes calibration and discrimination in-training
Ensures near-optimal feasible solution mathematically
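To make the idea of trading accuracy against per-group calibration constraints concrete, here is a toy sketch that uses projected primal-dual gradient updates on a Lagrangian. This is one common way to handle such constraints; the source does not state that GRADUATE uses this exact procedure. The single shared prediction `p`, the squared-error objective standing in for a real discrimination loss, and the function name `primal_dual` are all assumptions made for illustration.

```python
import numpy as np

def primal_dual(pop_rate, group_rates, eps, lr=0.05, dual_lr=0.05, steps=2000):
    """Toy problem: minimize (p - pop_rate)^2 subject to |p - r_g| <= eps for each group g."""
    rates = np.asarray(group_rates, dtype=float)
    p = float(pop_rate)              # start at the unconstrained optimum
    lam_hi = np.zeros_like(rates)    # multipliers for p - r_g - eps <= 0
    lam_lo = np.zeros_like(rates)    # multipliers for r_g - p - eps <= 0
    for _ in range(steps):
        # Gradient of the Lagrangian with respect to p.
        grad_p = 2.0 * (p - pop_rate) + lam_hi.sum() - lam_lo.sum()
        # Constraint values at the current p (positive means violated).
        viol_hi = p - rates - eps
        viol_lo = rates - p - eps
        # Primal descent step, then projected dual ascent steps.
        p -= lr * grad_p
        lam_hi = np.maximum(0.0, lam_hi + dual_lr * viol_hi)
        lam_lo = np.maximum(0.0, lam_lo + dual_lr * viol_lo)
    return p, lam_hi, lam_lo

# Assumed setup: population event rate 0.3, subgroup rates 0.25 and 0.5, tolerance 0.15.
p, lam_hi, lam_lo = primal_dual(0.3, [0.25, 0.5], eps=0.15)
print(p, lam_hi, lam_lo)
```

With these assumed numbers the feasible range for `p` is [0.35, 0.40], so the constrained solution moves from the unconstrained optimum 0.3 to about 0.35. Only the multiplier for the second group's lower-bound constraint ends up non-zero, which shows which subgroup is driving the trade-off.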
Authors

Thiti Suttaket
Department of Information Systems and Analytics, National University of Singapore, 13 Computing Drive, Singapore 117417

Stanley Kok
National University of Singapore
Artificial Intelligence, Machine Learning, Information Systems