Improving Survival Models in Healthcare by Balancing Imbalanced Cohorts: A Novel Approach

📅 2025-10-02
🤖 AI Summary
Problem: Medical survival models often discriminate poorly and are miscalibrated in high- and low-risk subgroups, the clinically critical populations where treatment decisions are made. Method: We propose a model-agnostic sampling approach that stratifies patients by baseline prognostic risk and applies matching within each stratum, rebalancing the training data distribution to improve learning from sparse tail subpopulations. Using Cox regression, we evaluate performance via Harrell's C-index, time-dependent AUC, and the Integrated Calibration Index (ICI), with internal validation via Efron's bias-corrected bootstrap. External validation is conducted on two independent cohorts of colorectal liver metastases (CRLM). Contribution/Results: Models trained on the risk-balanced cohorts show consistent improvements in internal validation over models trained on the full dataset, and higher stratified C-index values in the underrepresented high- and low-risk strata of the external cohorts. These results support data rebalancing as a practical, transportable complement to existing survival modeling methods.
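The internal validation step can be illustrated with a minimal sketch, assuming that "Efron's bias-corrected bootstrap" refers to the usual optimism-correction procedure: refit on each bootstrap resample, measure how much better the refit scores on its own resample than on the original data, and subtract the average gap from the apparent score. The names `optimism_corrected`, `fit`, and `score` are hypothetical placeholders, not the authors' code.

```python
import random


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Optimism-corrected performance estimate (illustrative sketch).

    fit(sample) -> model
    score(model, sample) -> float, e.g. a C-index
    """
    rng = random.Random(seed)
    # Apparent performance: train and evaluate on the same data.
    apparent = score(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Resample the data with replacement, keeping the original size.
        boot = [rng.choice(data) for _ in data]
        model = fit(boot)
        # Optimism: score on the resample minus score on the original data.
        total_optimism += score(model, boot) - score(model, data)
    return apparent - total_optimism / n_boot
```

As a toy check, a "model" that memorizes its training points and a score equal to the share of evaluation points it has seen gives an apparent score of 1.0, while the corrected estimate falls to roughly 0.63, the expected share of distinct points in a bootstrap resample.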

📝 Abstract
We explore whether survival model performance in underrepresented high- and low-risk subgroups, the regions of the prognostic spectrum where clinical decisions are most consequential, can be improved through targeted restructuring of the training dataset. Rather than modifying model architecture, we propose a novel risk-stratified sampling method that addresses imbalances in prognostic subgroup density to support more reliable learning in underrepresented tail strata. The method partitions patients by baseline prognostic risk and applies matching within each stratum to equalize representation across the risk distribution. We implement this framework on a cohort of 1,799 patients with resected colorectal liver metastases (CRLM), including 1,197 who received adjuvant chemotherapy and 602 who did not. All models in this study are Cox proportional hazards models trained on the same set of selected variables. Model performance is assessed via Harrell's C-index, time-dependent AUC, and the Integrated Calibration Index (ICI), with internal validation using Efron's bias-corrected bootstrapping. External validation is conducted on two independent CRLM datasets. Cox models trained on risk-balanced cohorts showed consistent improvements in internal validation compared with models trained on the full dataset, and noticeably higher stratified C-index values in the underrepresented high- and low-risk strata of the external cohorts. Our findings suggest that survival model performance in observational oncology cohorts can be meaningfully improved through targeted rebalancing of the training data across prognostic risk strata. This approach offers a practical and model-agnostic complement to existing methods, especially in applications where predictive reliability across the full risk continuum is critical to downstream clinical decisions.
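For readers unfamiliar with the main discrimination metric, here is a minimal pure-Python sketch of Harrell's C-index. It is an illustration only: the function name `harrell_c_index` is hypothetical, pairs with tied follow-up times are simply skipped, and the quadratic loop is meant for clarity rather than speed. A stratified C-index, as reported for the external cohorts, could be obtained by calling it on the patients of one risk stratum at a time.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model ranks correctly.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a patient with an observed event
        for j in range(n):
            # Comparable: patient i had the event strictly before j's follow-up ended.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event, higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third patient is censored at time 6.
c = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(c)
```

In this toy example there are five comparable pairs, and the model ranks four of them correctly.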
Problem

Research questions and friction points this paper is trying to address.

Improving survival model performance in underrepresented high- and low-risk subgroups
Addressing imbalances in prognostic subgroup density through risk-stratified sampling
Enhancing predictive reliability across the full risk continuum for clinical decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Risk-stratified sampling balances imbalanced prognostic subgroups
Matching within strata equalizes representation across the risk distribution
Model-agnostic data restructuring improves survival prediction reliability
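The rebalancing idea can be sketched as follows. This summary does not specify how strata are defined or how the matching is performed, so the sketch substitutes a simpler stand-in: patients are binned by a baseline risk score using caller-supplied cut points, and every non-empty stratum is randomly downsampled to the size of the sparsest one. The names `risk_balanced_sample`, `risk_of`, and `edges` are hypothetical, and the within-stratum matching step of the paper is not reproduced.

```python
import random
from collections import defaultdict


def risk_balanced_sample(patients, risk_of, edges, seed=0):
    """Return a subsample with equal counts in every non-empty risk stratum.

    patients: list of patient records
    risk_of(record) -> baseline prognostic risk score
    edges: ascending cut points; k cut points define k + 1 strata
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in patients:
        # Stratum index = number of cut points at or below the risk score.
        idx = sum(risk_of(p) >= e for e in edges)
        strata[idx].append(p)
    # Assumption: equal allocation at the size of the sparsest stratum.
    target = min(len(members) for members in strata.values())
    balanced = []
    for idx in sorted(strata):
        # Sample without replacement so no patient is duplicated.
        balanced.extend(rng.sample(strata[idx], target))
    return balanced
```

Because the step only changes which rows reach the training routine, the same Cox model (or any other survival learner) can be fitted on the balanced subsample without modification, which is what makes the approach model-agnostic.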
Authors
Catherine Ning
Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA
Dimitris Bertsimas
Boeing Professor of Operations Research, MIT
Operations Research, Optimization, Stochastics, Analytics, Health Care
Johan Gagnière
Department of Digestive and Hepatobiliary Surgery – Liver Transplantation U1071 Inserm/Clermont-Auvergne University Hospital of Clermont-Ferrand, Clermont-Ferrand, France
Stefan Buettner
Department of Surgery, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
Per Eystein Lønning
Department of Clinical Science, University of Bergen, Department of Oncology, Haukeland University Hospital, Bergen, Norway
Hideo Baba
Department of Gastroenterological Surgery, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
Itaru Endo
Department of Gastroenterological Surgery, Yokohama City University Graduate School of Medicine, Yokohama, Japan
Georgios Stasinos
Technical Chamber of Greece, Athens, Greece
Richard Burkhart
Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Federico N. Aucejo
Department of General Surgery, Digestive Disease Institute, Cleveland Clinic, Cleveland, OH, USA
Felix Balzer
Charité – Universitätsmedizin Berlin, Berlin, Germany
Cornelis Verhoef
Department of Surgery, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
Martin E. Kreis
Department of General and Visceral Surgery, Charité Campus Benjamin Franklin, Berlin, Germany
Georgios Antonios Margonis
Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA