Early Risk Prediction with Temporally and Contextually Grounded Clinical Language Processing

📅 2025-11-26
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses key challenges in early Type 2 Diabetes (T2D) risk prediction from EHR clinical notes: long documents, irregular temporal spacing, complex temporal dependencies, and constraints on privacy and computational resources. It proposes two methods. HiTGNN is a Hierarchical Temporal Graph Neural Network that jointly models fine-grained temporal structure and domain-specific medical knowledge graphs. ReVeAL is a lightweight verification framework that combines large language model distillation with test-time inference optimization to enable efficient, privacy-preserving, and interpretable predictions. Evaluated on real-world multicenter EHR data, the approach improves short-term risk prediction, with AUC increasing by 5.2% and sensitivity by 12.7%, while maintaining strong fairness across demographic subgroups. The core contribution claimed is the first integration of temporal graph modeling, clinical knowledge embedding, and lightweight large-model inference for risk prediction from clinical text.

πŸ“ Abstract
Clinical notes in Electronic Health Records (EHRs) capture rich temporal information on events, clinician reasoning, and lifestyle factors often missing from structured data. Leveraging them for predictive modeling can be impactful for timely identification of chronic diseases. However, they present core natural language processing (NLP) challenges: long text, irregular event distribution, complex temporal dependencies, privacy constraints, and resource limitations. We present two complementary methods for temporally and contextually grounded risk prediction from longitudinal notes. First, we introduce HiTGNN, a hierarchical temporal graph neural network that integrates intra-note temporal event structures, inter-visit dynamics, and medical knowledge to model patient trajectories with fine-grained temporal granularity. Second, we propose ReVeAL, a lightweight, test-time framework that distills the reasoning of large language models into smaller verifier models. Applied to opportunistic screening for Type 2 Diabetes (T2D) using temporally realistic cohorts curated from private and public hospital corpora, HiTGNN achieves the highest predictive accuracy, especially for near-term risk, while preserving privacy and limiting reliance on large proprietary models. ReVeAL enhances sensitivity to true T2D cases and retains explanatory reasoning. Our ablations confirm the value of temporal structure and knowledge augmentation, and fairness analysis shows HiTGNN performs more equitably across subgroups.
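The abstract reports results in terms of predictive accuracy and sensitivity to true T2D cases. As a reminder of what such metrics measure, here is a minimal sketch in plain Python. The labels, scores, the 0.5 decision threshold, and the function names are all hypothetical illustrations and are not taken from the paper or its evaluation code.

```python
def sensitivity(y_true, y_pred):
    """Fraction of true positive cases that were flagged: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive cases")
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = patient later diagnosed with T2D, 0 = not.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]          # assumed model risk scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"sensitivity = {sensitivity(y_true, y_pred):.3f}")
print(f"AUC         = {roc_auc(y_true, scores):.3f}")
```

Sensitivity depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two can move independently.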
Problem

Research questions and friction points this paper is trying to address.

Predicts chronic disease risk from EHR clinical notes
Models temporal dependencies in longitudinal patient data
Balances predictive accuracy with privacy and resource constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical temporal graph neural network for patient trajectories
Lightweight verifier models distill large language model reasoning
Temporal and knowledge augmentation for equitable risk prediction
Rochana Chaturvedi
Postdoc, Argonne National Lab
Natural Language Processing, Machine Learning, Healthcare, Climate Science, Social Science
Yue Zhou
University of Illinois Chicago, Chicago, IL, USA
Andrew Boyd
University of Illinois Chicago, Chicago, IL, USA
Brian T. Layden
University of Illinois Chicago, Chicago, IL, USA
Mudassir Rashid
Illinois Institute of Technology, Chicago, IL, USA
Lu Cheng
Assistant Professor, UIC CS
Socially Responsible AI, Causal Machine Learning, Data Mining, AI for Good
Ali Cinar
Illinois Institute of Technology, Chicago, IL, USA
Barbara Di Eugenio
Professor, University of Illinois Chicago
Natural Language Processing, Human Computer Interaction, Educational Technology, NLP for healthcare