🤖 AI Summary
To address the challenge of individualizing intravenous fluid and vasopressor dosing in sepsis management, this paper proposes the first offline reinforcement learning framework based on temporal heterogeneous graphs. It models multimodal clinical time-series data from MIMIC-III as patient-specific heterogeneous graphs that evolve over time. GraphSAGE or GATv2 encoders are trained jointly with decoders that predict the next patient state, which yields self-supervised state representations and decouples representation learning from policy optimization. The framework then applies the dBCQ algorithm for offline policy learning on these representations. Experiments indicate that graph-structured modeling is a promising basis for state representation, while also showing how complex representation learning is in this domain. This work provides the first empirical validation of temporal heterogeneous graphs for critical care treatment decision-making and clarifies the role of representation learning in medical offline RL.
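To make the idea of a patient-specific temporal heterogeneous graph concrete, here is a minimal sketch in plain Python. The node types (`patient`, `vital`, `lab`), the relation names, and the feature values are all hypothetical placeholders chosen for illustration; the schema the paper derives from MIMIC-III is not described in this summary and may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PatientGraph:
    """Toy temporal heterogeneous graph for one patient (hypothetical schema)."""

    # node type -> {node id -> feature dict}
    nodes: dict = field(default_factory=lambda: defaultdict(dict))
    # (source type, relation, target type) -> list of (source id, target id, time step)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, ntype, nid, **features):
        self.nodes[ntype][nid] = features

    def add_edge(self, src, relation, dst, t):
        # src and dst are (node type, node id) pairs; t is a discrete time step.
        (stype, sid), (dtype, did) = src, dst
        self.edges[(stype, relation, dtype)].append((sid, did, t))

    def snapshot(self, t):
        """Return the edges observed up to and including time step t, per edge type."""
        return {
            etype: [(s, d, ts) for s, d, ts in triples if ts <= t]
            for etype, triples in self.edges.items()
        }


# Illustrative values only, not taken from MIMIC-III.
g = PatientGraph()
g.add_node("patient", "p1", age=63)
g.add_node("vital", "hr_t0", name="heart_rate", value=112.0)
g.add_node("vital", "hr_t1", name="heart_rate", value=104.0)
g.add_node("lab", "lac_t1", name="lactate", value=3.1)
g.add_edge(("patient", "p1"), "has_vital", ("vital", "hr_t0"), t=0)
g.add_edge(("patient", "p1"), "has_vital", ("vital", "hr_t1"), t=1)
g.add_edge(("patient", "p1"), "has_lab", ("lab", "lac_t1"), t=1)
```

Calling `g.snapshot(t)` gives the graph as it looked at time step `t`, which is the kind of per-step input a graph encoder could consume to produce a state representation.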
📝 Abstract
Sepsis is a serious, life-threatening condition. When treating sepsis, it is challenging to determine the correct amount of intravenous fluids and vasopressors for a given patient. While automated reinforcement learning (RL)-based methods have been used to support these decisions with promising results, previous studies have relied on relational data. Given the complexity of modern healthcare data, representing data as a graph may provide a more natural and effective approach. This study models patient data from the well-known MIMIC-III dataset as a heterogeneous graph that evolves over time. We then explore two Graph Neural Network architectures, GraphSAGE and GATv2, for learning patient state representations, decoupling representation learning from policy learning. The encoders are trained to produce latent state representations jointly with decoders that predict the next patient state. These representations are then used for policy learning with the dBCQ algorithm. The results of our experimental evaluation confirm the potential of a graph-based approach, while highlighting the complexity of representation learning in this domain.
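The policy-learning step can be illustrated with the action-filtering rule of discrete BCQ (dBCQ): an action is eligible only if its estimated behavior-policy probability, relative to the most likely action, exceeds a threshold, and the highest-valued eligible action is chosen. The sketch below assumes the Q-values and behavior probabilities are already available as plain lists; the function name, the three-action toy example, and the threshold of 0.3 are illustrative assumptions, not settings reported by the paper.

```python
def bcq_select_action(q_values, behavior_probs, threshold=0.3):
    """Pick the highest-Q action among those the behavior policy considers likely.

    Action a is eligible when G(a|s) / max_a' G(a'|s) > threshold, where G is an
    estimate of the behavior policy. The threshold value is an illustrative choice.
    """
    if len(q_values) != len(behavior_probs):
        raise ValueError("q_values and behavior_probs must have the same length")
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    max_prob = max(behavior_probs)
    if max_prob <= 0.0:
        raise ValueError("behavior_probs must contain a positive entry")

    # The most likely action has ratio 1 > threshold, so the list is never empty.
    eligible = [a for a, p in enumerate(behavior_probs) if p / max_prob > threshold]
    return max(eligible, key=lambda a: q_values[a])


# Toy example: action 1 has the highest Q-value but is rarely seen in the data.
q = [1.0, 5.0, 2.0]
probs = [0.6, 0.05, 0.35]
filtered_choice = bcq_select_action(q, probs, threshold=0.3)
unfiltered_choice = bcq_select_action(q, probs, threshold=0.0)
```

With the filter active, the rarely observed action 1 is excluded and action 2 is selected; with the threshold at 0, the rule reduces to a plain greedy choice over Q-values and picks action 1. This is how the method keeps the learned policy close to treatments that clinicians actually administered in the offline data.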