Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
In survival analysis, conventional group-based modeling often compromises both predictive accuracy and calibration. Method: This paper proposes a discrete-time deep Mixture-of-Experts (MoE) model that abandons the restrictive inductive bias of hard expert assignment inherent in standard MoE architectures. Instead, it introduces a learnable gating mechanism and highly expressive, patient-specific expert networks to decouple subgroup discovery from individualized risk prediction. The model is trained end-to-end to jointly optimize clustering fidelity and prediction quality. Contribution/Results: Experiments across multiple clinical survival datasets demonstrate consistent improvements over state-of-the-art baselines: lower calibration error (measured by Brier score and Expected Calibration Error) and higher predictive performance (measured by Concordance index and Integrated Brier Score). These results validate the effectiveness of co-optimizing interpretable subgroup structure learning with accurate, well-calibrated risk prediction.
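The gating-plus-experts forward pass described above can be sketched in a few lines. This is a minimal numpy illustration, not the paper's implementation: the linear gate and linear per-expert hazard heads stand in for the deep networks the paper uses, and all weights are random placeholders. The key structural points it shows are that (a) the gate assigns each patient a soft distribution over experts, and (b) every expert maps the patient's own features to per-bin hazards, so predictions are patient-specific rather than fixed group prototypes.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_patients, n_features, n_experts, n_bins = 4, 8, 3, 10

# Random placeholders standing in for trained deep-network parameters.
W_gate = rng.normal(size=(n_features, n_experts))
W_expert = rng.normal(size=(n_experts, n_features, n_bins))

x = rng.normal(size=(n_patients, n_features))

# Learnable soft gating: a distribution over experts per patient.
gate = softmax(x @ W_gate, axis=-1)             # (n_patients, n_experts)

# Each expert maps the *same patient features* to per-bin hazards,
# so its prediction is tailored to the individual patient.
logits = np.einsum('pf,efb->peb', x, W_expert)  # (n_patients, n_experts, n_bins)
hazards = 1.0 / (1.0 + np.exp(-logits))         # sigmoid hazard per time bin

# Mixture hazard: gate-weighted combination of expert hazards.
mix_hazard = np.einsum('pe,peb->pb', gate, hazards)

# Discrete-time survival curve: product of (1 - hazard) over bins.
survival = np.cumprod(1.0 - mix_hazard, axis=1)
print(survival.shape)  # → (4, 10)
```

Because the gate weights sum to one and each hazard lies in (0, 1), the mixture survival curve is automatically a valid, monotonically non-increasing probability curve; end-to-end training would backpropagate a discrete-time survival likelihood through both the gate and the experts jointly.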

📝 Abstract
Deep mixture-of-experts models have attracted a lot of attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes, that predictions for individual patients must look like predictions for the group they're assigned to. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts (MoE) based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator between this array of MoEs is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.
Problem

Research questions and friction points this paper is trying to address.

Improving survival prediction accuracy and calibration in mixture-of-experts models
Overcoming restrictive inductive bias in patient grouping strategies
Developing expressive experts that tailor predictions per patient
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-experts architecture improves survival analysis
Expressive experts tailor predictions per patient
Discrete-time deep models enhance calibration and accuracy
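The calibration claims above are evaluated with the Brier score, which measures the squared gap between a predicted survival probability at a horizon t and the observed outcome. A minimal sketch of that metric follows; it is a simplified, illustrative version that drops the inverse-probability-of-censoring weights (IPCW) a full evaluation would use, and all function and variable names are ours, not the paper's.

```python
import numpy as np

def brier_score(pred_surv, event_time, event_observed, t):
    """Brier score at horizon t for survival predictions.

    pred_surv: predicted P(T > t) per patient.
    event_time / event_observed: follow-up time and event indicator.
    Simplification: patients censored before t are dropped instead of
    being reweighted with IPCW, as a proper evaluation would do.
    """
    y = (event_time > t).astype(float)         # 1 if known to survive past t
    usable = event_observed | (event_time > t)  # outcome at t is determined
    return np.mean((pred_surv[usable] - y[usable]) ** 2)

# Toy check: three patients, horizon t = 3.
surv_at_t = np.array([0.9, 0.2, 0.8])
times = np.array([5, 1, 7])
observed = np.array([True, True, False])
print(round(brier_score(surv_at_t, times, observed, t=3), 4))  # → 0.03
```

A perfectly calibrated and perfectly discriminative model would score 0; the Integrated Brier Score reported in the paper averages this quantity over a grid of horizons.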
Todd Morrill
Department of Computer Science, Columbia University, USA
Aahlad Puli
Department of Computer Science, New York University, USA
Murad Megjhani
Department of Neurology, Columbia University Medical Center, USA; Department of Computer Science, Barnard College, USA
Soojin Park
Department of Neurology, Columbia University Medical Center, USA; Department of Biomedical Informatics, Columbia University Medical Center, USA; NewYork-Presbyterian Hospital at Columbia University Medical Center, USA
Richard Zemel
Professor of Computer Science, University of Toronto
Machine Learning · Computer Vision · Neural Coding