Discriminative Representation Learning for Clinical Prediction

πŸ“… 2026-03-21
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work proposes a supervised deep learning framework that abandons the conventional self-supervised pretraining paradigm in favor of direct supervision-driven representation learning, explicitly optimizing the geometric structure of learned representations to maximize the ratio of inter-class separation to intra-class variance. By doing so, the model focuses on clinically relevant dimensions, effectively incorporating inductive biases suited for high-quality, outcome-oriented clinical prediction tasks. Trained end-to-end in a single stage, the approach significantly outperforms multiple self-supervised baselines on electronic health record–based prediction tasks such as mortality and readmission risk. The method demonstrates consistent improvements in discriminative performance, calibration, and sample efficiency, highlighting the advantages of supervision-guided representation learning in clinical settings where labeled outcomes are reliable and task-specific.

πŸ“ Abstract
Foundation models in healthcare have largely adopted self-supervised pretraining objectives inherited from natural language processing and computer vision, emphasizing reconstruction and large-scale representation learning prior to downstream adaptation. We revisit this paradigm in outcome-centric clinical prediction settings and argue that, when high-quality supervision is available, direct outcome alignment may provide a stronger inductive bias than generative pretraining. We propose a supervised deep learning framework that explicitly shapes representation geometry by maximizing inter-class separation relative to within-class variance, thereby concentrating model capacity along clinically meaningful axes. Across multiple longitudinal electronic health record tasks, including mortality and readmission prediction, our approach consistently outperforms masked, autoregressive, and contrastive pretraining baselines under matched model capacity. The proposed method improves discrimination, calibration, and sample efficiency, while simplifying the training pipeline to single-stage optimization. These findings suggest that in low-entropy, outcome-driven healthcare domains, supervision can act as the statistically optimal driver of representation learning, challenging the assumption that large-scale self-supervised pretraining is a prerequisite for strong clinical performance.
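The geometric objective described in the abstract, maximizing inter-class separation relative to within-class variance, can be illustrated with a Fisher-style scatter ratio. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of the quantity such a loss would drive up, with the function name `fisher_ratio` chosen here for clarity.

```python
import numpy as np

def fisher_ratio(embeddings, labels):
    """Ratio of between-class scatter to within-class scatter.

    A larger value means class means sit far apart relative to how
    tightly each class clusters around its own mean -- the geometry
    the abstract describes a supervised objective as encouraging.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    overall_mean = embeddings.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = embeddings[labels == c]          # points of one class
        mean_c = cls.mean(axis=0)
        # weighted squared distance of the class mean from the global mean
        between += len(cls) * np.sum((mean_c - overall_mean) ** 2)
        # squared spread of the class around its own mean
        within += np.sum((cls - mean_c) ** 2)
    return between / within
```

Embeddings with well-separated, tight clusters score much higher than overlapping ones, so a training loss could, for example, minimize the negative of this ratio (or a differentiable surrogate of it) on mini-batch representations.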
Problem

Research questions and friction points this paper is trying to address.

clinical prediction
representation learning
supervised learning
outcome alignment
electronic health records
Innovation

Methods, ideas, or system contributions that make the work stand out.

supervised representation learning
clinical prediction
representation geometry
outcome-driven learning
electronic health records
πŸ”Ž Similar Papers
No similar papers found.
Yang Zhang
The University of Hong Kong (HKU)
Li Fan
The University of Hong Kong (HKU)
Samuel Lawrence
Columbia University
Shi Li
Professor, Nanjing University
Theoretical Computer Science