🤖 AI Summary
This study addresses core bottlenecks that hinder clinical AI applications built on the MIMIC dataset: coarse-grained data resolution, heterogeneous coding schemes, weak causal inference capability, and constraints on privacy preservation and interoperability. We propose the first open-question-driven framework for reconstructing the MIMIC research landscape. Methodologically, we establish a structured taxonomy of challenges and integrate three complementary technical pathways: federated learning, hybrid temporal modeling (incorporating dimensionality reduction and causal inference), and standardized preprocessing. Our contributions include: (i) systematic identification of 12 critical barriers; (ii) distillation of 9 reproducible, interoperable technical advances; (iii) establishment of a privacy-preserving analytical paradigm tailored to intensive care data; and (iv) delivery of an actionable implementation roadmap for clinical decision support systems, aimed at improving model generalizability and real-world deployment efficiency.
📝 Abstract
The Medical Information Mart for Intensive Care (MIMIC) datasets have become a cornerstone of digital health research by providing freely accessible, deidentified records from tens of thousands of critical care admissions, enabling a broad spectrum of applications in clinical decision support, outcome prediction, and healthcare analytics. Although numerous studies and surveys have explored the predictive power and clinical utility of MIMIC-based models, critical challenges in data integration, representation, and interoperability remain underexplored. This paper presents a comprehensive survey focused specifically on these open problems. We identify persistent issues such as data granularity, cardinality limitations, heterogeneous coding schemes, and ethical constraints that hinder the generalizability and real-time implementation of machine learning models. We highlight key progress in dimensionality reduction, temporal modelling, causal inference, and privacy-preserving analytics, and outline promising directions including hybrid modelling, federated learning, and standardized preprocessing pipelines. By critically examining these structural limitations and their implications, this survey offers actionable insights to guide the next generation of MIMIC-powered digital health innovations.