🤖 AI Summary
This study addresses individualized optimization of mechanical ventilation (MV) parameters in intensive care units (ICUs). Existing offline reinforcement learning (RL) methods struggle to model hybrid (continuous and discrete) action spaces, and coarse action discretization introduces distributional shift and safety risks. We propose the first offline RL framework explicitly designed for hybrid action spaces, adapting Implicit Q-Learning (IQL) and Ensemble-Diversified Actor-Critic (EDAC) to operate on them directly. A clinically grounded, dense reward function is built on key outcomes, including ventilator-free days and physiological target attainment, and is augmented with clinical-knowledge-guided reward shaping. Evaluated on real-world ICU data, our approach significantly improves policy safety and patient-specific adaptability. Retrospective analysis shows enhanced physiological stability and increased ventilator-free days, pointing to a clinically viable paradigm for AI-assisted ventilation decision-making in critical care.
📝 Abstract
Invasive mechanical ventilation (MV) is a life-sustaining therapy for critically ill patients in the intensive care unit (ICU). However, optimizing its settings remains a complex and error-prone process due to patient-specific variability. While offline reinforcement learning (RL) shows promise for MV control, current state-of-the-art (SOTA) methods struggle with the hybrid (continuous and discrete) nature of MV actions. Discretizing the action space restricts the actions available, because the number of setting combinations grows exponentially with the number of settings, and it introduces distribution shifts that can compromise safety. In this paper, we propose optimizations that build upon prior work in action space reduction to address the challenges of discrete action spaces. We also adapt SOTA offline RL algorithms (IQL and EDAC) to operate directly on hybrid action spaces, thereby avoiding the pitfalls of discretization. Additionally, we introduce a clinically grounded reward function based on ventilator-free days and physiological targets, which provides a more meaningful optimization objective than traditional sparse mortality-based rewards. Our findings demonstrate that AI-assisted MV optimization may enhance patient safety and enable individualized lung support, representing a significant advancement toward intelligent, data-driven critical care solutions.
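To make the point about exponential growth under discretization concrete, the following is a minimal sketch of how the joint action count scales. The setting names and bin counts are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from math import prod


def joint_action_count(bins_per_setting: dict[str, int]) -> int:
    """Number of joint discrete actions: the product of per-setting bin counts."""
    return prod(bins_per_setting.values())


# Hypothetical ventilator settings (assumed names and bin counts, not the paper's).
# Four continuous settings are binned; "mode" is a genuinely discrete choice.
coarse = {"tidal_volume": 3, "peep": 3, "fio2": 3, "resp_rate": 3, "mode": 2}

# Finer control: 10 bins per continuous setting, same discrete choice.
fine = {name: (n if name == "mode" else 10) for name, n in coarse.items()}

print(joint_action_count(coarse))  # 3**4 * 2
print(joint_action_count(fine))    # 10**4 * 2
```

Under these assumed numbers, going from 3 to 10 bins per continuous setting raises the joint action count from 162 to 20,000. A fixed offline dataset then covers a much smaller fraction of the actions, which is one way to see why a hybrid representation that keeps continuous settings continuous can be attractive.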