🤖 AI Summary
Electronic health records (EHRs) are complex and heterogeneous, which limits the accuracy of clinical outcome prediction. To address this, the authors propose ReMedi, a framework that uses ground-truth clinical outcomes as supervision for a challenging-sample regeneration mechanism. For complex clinical questions, the mechanism uses the ground-truth answer as a hint to regenerate rationale-answer variants, which are then fed into the preference data construction loop. The model is trained on these pairs with supervised fine-tuning followed by preference tuning, which improves its reasoning over patient context and its predictive performance. Across multiple EHR prediction tasks, ReMedi improves F1 score by up to 19.9% over state-of-the-art baselines.
📝 Abstract
Predicting future clinical outcomes from electronic health records (EHRs) remains challenging due to the complexity and heterogeneity of patient data. Large language models (LLMs) have shown strong potential for such predictive tasks, yet existing approaches mainly focus on enhancing medical knowledge through distillation or retrieval-augmented generation (RAG) while relying on the model's internal ability to interpret contextual information. In this work, we present ReMedi (Reasoner for Medical Clinical Prediction), a framework for improving clinical outcome prediction from EHRs. For complex clinical questions, ReMedi generates rationale-answer pairs using a challenging-sample regeneration mechanism, which leverages ground-truth answers as hints to improve the generated reasoning. ReMedi integrates this ground-truth outcome guidance into the preference data construction loop, regenerating rationale-answer variants that are then used for fine-tuning and preference tuning. By tuning on these rationale-answer pairs, the model improves its predictive performance. Experiments on multiple EHR prediction tasks demonstrate gains of up to 19.9% in F1 score over state-of-the-art baselines, underscoring ReMedi's effectiveness in real-world clinical prediction.
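The regeneration loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how many samples are drawn, how a sample is judged "challenging", or how chosen and rejected responses are paired. The names `Sample`, `PreferencePair`, `build_preference_pairs`, and `toy_generate` are hypothetical, and the stand-in generator replaces a real LLM call. The sketch assumes that a question counts as challenging when no unhinted sample matches the ground-truth label.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    rationale: str  # generated chain-of-thought text
    answer: str     # predicted outcome label


@dataclass
class PreferencePair:
    prompt: str
    chosen: Sample    # rationale whose answer matches the ground truth
    rejected: Sample  # rationale whose answer does not


# Hypothetical generator interface: (prompt, optional ground-truth hint) -> Sample.
Generator = Callable[[str, Optional[str]], Sample]


def build_preference_pairs(
    prompts: List[str],
    labels: List[str],
    generate: Generator,
    n_samples: int = 4,
) -> List[PreferencePair]:
    """Build preference pairs, regenerating with a hint for hard questions."""
    pairs: List[PreferencePair] = []
    for prompt, label in zip(prompts, labels):
        # First pass: sample rationale-answer pairs without any hint.
        samples = [generate(prompt, None) for _ in range(n_samples)]
        correct = [s for s in samples if s.answer == label]
        wrong = [s for s in samples if s.answer != label]

        # Assumed criterion: the question is "challenging" if every
        # unhinted sample is wrong. Regenerate with the label as a hint.
        if not correct:
            hinted = generate(prompt, label)
            if hinted.answer == label:
                correct.append(hinted)

        # A pair needs one correct and one incorrect response.
        if correct and wrong:
            pairs.append(PreferencePair(prompt, correct[0], wrong[0]))
    return pairs


def toy_generate(prompt: str, hint: Optional[str]) -> Sample:
    """Stand-in for an LLM: answers "no" unless given a hint."""
    answer = hint if hint is not None else "no"
    return Sample(rationale=f"reasoning for {prompt!r}", answer=answer)


pairs = build_preference_pairs(
    prompts=["Will patient A be readmitted?", "Will patient B be readmitted?"],
    labels=["yes", "no"],
    generate=toy_generate,
)
```

In this toy run only the first question produces a pair: the unhinted samples are all wrong, so the hinted regeneration supplies the chosen response. The second question is answered correctly every time, leaves no rejected candidate, and is skipped. A real pipeline would pass the resulting pairs to a preference-tuning step.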