🤖 AI Summary
Balancing oxygenation maintenance and ventilator-induced lung injury (VILI) risk in ICU mechanical ventilation remains a critical clinical challenge.
Method: This paper proposes an interpretable reinforcement learning (RL) framework for ventilator control, in which candidate policies are assessed with a causal, nonparametric, model-based off-policy evaluation. The learned policy is a decision tree that is aligned with domain knowledge, which avoids the opacity of black-box deep RL models. The framework is trained and evaluated on real-world ICU data from the MIMIC-III database.
Contribution/Results: In off-policy evaluation, the decision tree policy performs comparably to state-of-the-art deep RL methods at raising SpO₂ while avoiding aggressive ventilator settings, and it outperforms standard behavior cloning. Because the policy is a decision tree, each recommendation can be traced to an explicit sequence of rules, which supports clinical transparency and auditability. The results suggest that interpretable, domain-aligned RL is a viable route toward decision support in critical care.
📝 Abstract
Mechanical ventilation is a critical life support intervention that delivers controlled air and oxygen to a patient's lungs, assisting or replacing spontaneous breathing. While several data-driven approaches have been proposed to optimize ventilator control strategies, they often lack interpretability and alignment with domain knowledge, hindering clinical adoption. This paper presents a methodology for interpretable reinforcement learning (RL) aimed at improving mechanical ventilation control as part of connected health systems. Using a causal, nonparametric, model-based off-policy evaluation, we assess RL policies for their ability to enhance patient-specific outcomes, specifically increasing blood oxygen levels (SpO₂) while avoiding aggressive ventilator settings that may cause ventilator-induced lung injury and other complications. Through numerical experiments on real-world ICU data from the MIMIC-III database, we demonstrate that our interpretable decision tree policy achieves performance comparable to state-of-the-art deep RL methods while outperforming standard behavior cloning approaches. The results highlight the potential of interpretable, data-driven decision support systems to improve safety and efficiency in personalized ventilation strategies, paving the way for integration into connected healthcare environments.
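To illustrate what an interpretable decision tree policy means in practice, the following is a minimal sketch of a tree that maps a patient state to a ventilator action and returns the rule path behind each decision. Everything in it is an assumption made for illustration: the feature names (`spo2`, `fio2`), the thresholds, the action labels, and the `decide` helper are hypothetical, are not taken from the paper, and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    action: str  # hypothetical action label


@dataclass
class Split:
    feature: str      # key looked up in the state dictionary
    threshold: float  # branch on state[feature] <= threshold
    low: "Node"       # subtree taken when the test is true
    high: "Node"      # subtree taken when the test is false


Node = Union[Leaf, Split]

# Illustrative thresholds only: NOT from the paper and NOT clinical guidance.
TOY_POLICY: Node = Split(
    "spo2", 92.0,
    low=Split("fio2", 0.6, low=Leaf("increase_fio2"), high=Leaf("increase_peep")),
    high=Split("fio2", 0.4, low=Leaf("hold"), high=Leaf("decrease_fio2")),
)


def decide(node: Node, state: Dict[str, float]) -> Tuple[str, List[str]]:
    """Walk the tree and return the chosen action plus the rules that fired."""
    path: List[str] = []
    while isinstance(node, Split):
        value = state[node.feature]
        go_low = value <= node.threshold
        op = "<=" if go_low else ">"
        path.append(f"{node.feature}={value} {op} {node.threshold}")
        node = node.low if go_low else node.high
    return node.action, path


if __name__ == "__main__":
    action, path = decide(TOY_POLICY, {"spo2": 89.0, "fio2": 0.5})
    print(action)
    for rule in path:
        print("  because", rule)
```

The returned path is what makes such a policy auditable: a reviewer can check each threshold test against the recorded patient state, which is not possible in the same way for a deep RL policy.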