CURA: Clinical Uncertainty Risk Alignment for Language Model-Based Risk Prediction

📅 2026-04-16

🤖 AI Summary

This work addresses the limited clinical reliability of language models in risk prediction caused by poorly calibrated uncertainty estimates. The authors propose a two-stage framework that first fine-tunes a domain-specific clinical language model to obtain task-adapted patient embeddings and then fine-tunes a multi-head classifier with a bi-level uncertainty objective. The objective combines an individual-level calibration term, which aligns predictive uncertainty with each patient's likelihood of error, and a cohort-aware regularizer, which pulls risk estimates toward event rates in their local embedding-space neighborhoods and places extra weight on ambiguous cohorts near the decision boundary. The cohort-aware term can be read as a cross-entropy loss with neighborhood-informed soft labels, which gives the method a label-smoothing interpretation. Evaluated on MIMIC-IV clinical risk prediction tasks across several clinical language models, the method consistently improves calibration metrics without substantially compromising discrimination, and it reduces overconfident false reassurance.
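To make "calibration" concrete, here is a minimal sketch of one common binned calibration error for binary risk predictions. The listing does not say which calibration metrics the paper reports, so the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all assumptions chosen for illustration.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned gap between mean predicted risk and observed event rate.

    Illustrative metric only; not taken from the paper.
    """
    p = np.asarray(probs, dtype=float)
    y = np.asarray(labels, dtype=float)
    # Assign each prediction to an equal-width bin; p == 1.0 falls in the last bin.
    bin_ids = np.minimum((p * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between average predicted risk and observed event rate in this bin.
            gap = abs(p[mask].mean() - y[mask].mean())
            # Weight the gap by the fraction of samples in the bin.
            ece += mask.mean() * gap
    return float(ece)
```

A model that predicts 0.95 for four patients of whom only three have the event is overconfident, and this measure reports the gap between 0.95 and the observed rate of 0.75.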

📝 Abstract

Clinical language models (LMs) are increasingly applied to support clinical risk prediction from free-text notes, yet their uncertainty estimates often remain poorly calibrated and clinically unreliable. In this work, we propose Clinical Uncertainty Risk Alignment (CURA), a framework that aligns clinical LM-based risk estimates and uncertainty with both individual error likelihoods and cohort-level ambiguities. CURA first fine-tunes domain-specific clinical LMs to obtain task-adapted patient embeddings, and then performs uncertainty fine-tuning of a multi-head classifier using a bi-level uncertainty objective. Specifically, an individual-level calibration term aligns predictive uncertainty with each patient's likelihood of error, while a cohort-aware regularizer pulls risk estimates toward event rates in their local neighborhoods in the embedding space and places extra weight on ambiguous cohorts near the decision boundary. We further show that this cohort-aware term can be interpreted as a cross-entropy loss with neighborhood-informed soft labels, providing a label-smoothing view of our method. Extensive experiments on MIMIC-IV clinical risk prediction tasks across various clinical LMs show that CURA consistently improves calibration metrics without substantially compromising discrimination. Further analysis illustrates that CURA reduces overconfident false reassurance and yields more trustworthy uncertainty estimates for downstream clinical decision support.
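The soft-label reading of the cohort-aware term can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the k-nearest-neighbor rule with Euclidean distance, the function names, and in particular the `boundary_weights` formula are hypothetical choices; the abstract only states that risk estimates are pulled toward neighborhood event rates and that ambiguous cohorts receive extra weight.

```python
import numpy as np

def neighborhood_soft_labels(embeddings, labels, k=3):
    """Soft label per patient = event rate among its k nearest neighbors (self excluded)."""
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels, dtype=float)
    # Pairwise squared Euclidean distances in embedding space (assumed metric).
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a patient is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]
    return y[nn].mean(axis=1)

def boundary_weights(soft_labels):
    """Hypothetical weighting: 2.0 where the neighborhood rate is 0.5, 1.0 at rates 0 or 1."""
    q = np.asarray(soft_labels, dtype=float)
    return 1.0 + 4.0 * q * (1.0 - q)

def soft_label_cross_entropy(probs, soft_labels, weights=None, eps=1e-12):
    """Binary cross-entropy of predicted risks against soft (fractional) labels."""
    p = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    q = np.asarray(soft_labels, dtype=float)
    ce = -(q * np.log(p) + (1.0 - q) * np.log(1.0 - p))
    if weights is None:
        return float(ce.mean())
    w = np.asarray(weights, dtype=float)
    return float((w * ce).sum() / w.sum())

# Toy cohort: a mixed cluster near the origin and a pure-positive cluster far away.
emb = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
lab = [0, 0, 1, 1, 1, 1]
soft = neighborhood_soft_labels(emb, lab, k=2)
w = boundary_weights(soft)
```

In the toy cohort, patients in the mixed cluster receive fractional targets while those in the pure cluster keep a target of 1, so a confident prediction is penalized only where neighboring outcomes disagree.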

Problem

Research questions and friction points this paper is trying to address.

clinical language models

risk prediction

uncertainty calibration

clinical decision support

model reliability

Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty calibration

clinical language models

risk prediction