A global log for medical AI

📅 2025-10-05
🤖 AI Summary
Medical AI systems lack standardized logging mechanisms, impeding rigorous real-world performance evaluation, adverse event traceability, bias identification, and data drift monitoring. To address this, we propose MedLog, the first event-level logging protocol specifically designed for clinical AI applications. MedLog defines nine core fields (header, model, user, target, inputs, artifacts, outputs, outcomes, and feedback) and integrates risk-based sampling, lifecycle-aware retention policies, and write-behind caching to ensure compatibility with resource-constrained and heterogeneous clinical infrastructures. Implemented as a lightweight, interoperable middleware, MedLog enables fully auditable, end-to-end invocation tracing. It supports transparent regulatory oversight, continuous performance benchmarking, and dynamic risk surveillance. Empirical validation across diverse clinical AI deployments demonstrates negligible latency overhead (<2.3 ms per log entry) and seamless integration with existing EHR and AI orchestration systems. MedLog establishes foundational infrastructure for safe, scalable clinical AI deployment and digital epidemiology research.

📝 Abstract
Modern computer systems often rely on syslog, a simple, universal protocol that records every critical event across heterogeneous infrastructure. However, healthcare's rapidly growing clinical AI stack has no equivalent. As hospitals rush to pilot large language models and other AI-based clinical decision support tools, we still lack a standard way to record how, when, by whom, and for whom these AI models are used. Without that transparency and visibility, it is challenging to measure real-world performance and outcomes, detect adverse events, or correct bias or dataset drift. In the spirit of syslog, we introduce MedLog, a protocol for event-level logging of clinical AI. Any time an AI model is invoked to interact with a human, interface with another algorithm, or act independently, a MedLog record is created. This record consists of nine core fields: header, model, user, target, inputs, artifacts, outputs, outcomes, and feedback, providing a structured and consistent record of model activity. To encourage early adoption, especially in low-resource settings, and minimize the data footprint, MedLog supports risk-based sampling, lifecycle-aware retention policies, and write-behind caching; detailed traces for complex, agentic, or multi-stage workflows can also be captured under MedLog. MedLog can catalyze the development of new databases and software to store and analyze MedLog records. Realizing this vision would enable continuous surveillance, auditing, and iterative improvement of medical AI, laying the foundation for a new form of digital epidemiology.
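To make the record structure concrete, here is a minimal Python sketch of a container holding the nine core fields named in the abstract. Only the nine field names come from the source. The class name `MedLogRecord`, the helper `new_header`, the use of free-form dictionaries, the JSON serialization, and every key inside the example values are assumptions made for illustration, not part of the MedLog protocol as published.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class MedLogRecord:
    """Hypothetical container for the nine core fields listed in the abstract.

    The field names follow the abstract; their contents are free-form
    dictionaries here because the source does not specify a schema.
    """

    header: dict[str, Any]
    model: dict[str, Any]
    user: dict[str, Any]
    target: dict[str, Any]
    inputs: dict[str, Any]
    artifacts: dict[str, Any] = field(default_factory=dict)
    outputs: dict[str, Any] = field(default_factory=dict)
    outcomes: dict[str, Any] = field(default_factory=dict)
    feedback: dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize the whole record as one JSON object (an assumed wire format).
        return json.dumps(asdict(self), sort_keys=True)


def new_header() -> dict[str, Any]:
    # Assumed header contents: a unique record ID and a UTC timestamp.
    return {
        "record_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# One record per model invocation; all values below are made-up placeholders.
record = MedLogRecord(
    header=new_header(),
    model={"name": "example-triage-model", "version": "0.1"},
    user={"role": "clinician"},
    target={"patient_ref": "pseudonymous-id-123"},
    inputs={"prompt": "Summarize the latest lab results."},
    outputs={"text": "(model response)"},
)
print(record.to_json())
```

In this sketch, outcomes and feedback default to empty because they would typically be filled in after the invocation, once a clinical result or user response is known.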
Problem

Research questions and friction points this paper is trying to address.

Lack of a standard logging protocol for clinical AI system usage
Difficulty monitoring real-world performance and detecting adverse events
Absence of structured records for AI model interactions in healthcare
Innovation

Methods, ideas, or system contributions that make the work stand out.

MedLog protocol for clinical AI event logging
Structured records with nine core fields
Risk-based sampling and lifecycle-aware retention policies
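As a rough illustration of how risk-based sampling and write-behind caching might fit together, the Python sketch below logs every high-risk invocation, samples lower-risk ones, and buffers accepted records in memory before writing them out in batches. The risk tiers, the sampling rates, and the names `SAMPLING_RATES`, `should_log`, and `WriteBehindBuffer` are all hypothetical. The buffer is also a simplified, synchronous stand-in: a production write-behind cache would usually flush asynchronously.

```python
import random
from collections import deque
from typing import Callable

# Hypothetical sampling rates per risk tier; the source does not define tiers.
SAMPLING_RATES = {"high": 1.0, "medium": 0.5, "low": 0.1}


def should_log(risk_tier: str, rng: random.Random) -> bool:
    """Decide whether to keep a record, based on the invocation's risk tier."""
    # Unknown tiers fall back to a rate of 1.0, so they are always logged.
    rate = SAMPLING_RATES.get(risk_tier, 1.0)
    return rng.random() < rate


class WriteBehindBuffer:
    """Collects records in memory and writes them to the sink in batches."""

    def __init__(self, sink: Callable[[list[dict]], None], batch_size: int = 3):
        self._sink = sink
        self._batch_size = batch_size
        self._pending: deque[dict] = deque()

    def append(self, record: dict) -> None:
        # The caller returns immediately; the write happens once a batch is full.
        self._pending.append(record)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand all pending records to the sink in a single call.
        if self._pending:
            batch = list(self._pending)
            self._pending.clear()
            self._sink(batch)


# Usage: an in-memory list stands in for the real log store.
stored: list[dict] = []
buffer = WriteBehindBuffer(sink=stored.extend, batch_size=3)
rng = random.Random(0)
for i in range(5):
    if should_log("high", rng):
        buffer.append({"event": i})
buffer.flush()  # Write out whatever is still pending at shutdown.
```

Batching keeps logging off the critical path of a clinical request, which is the main appeal in low-resource settings. The cost is that records still in the buffer can be lost if the process crashes before a flush.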
Authors

Ayush Noori
A.B./S.M., Harvard University; Rhodes Scholar
Artificial Intelligence, Neurodegeneration, Precision Medicine, Knowledge Graphs, Multimodal AI

Adam Rodman
Assistant Professor of Medicine, Harvard Medical School
Clinical reasoning, AI, digital education, medical history

Alan Karthikesalingam
Google Health
Artificial Intelligence in Healthcare

Bilal A. Mateen
University of Birmingham, Birmingham, UK

Christopher A. Longhurst
UC San Diego
Artificial Intelligence, Clinical Decision Support, Patient Safety, Electronic Health Records

Daniel Yang
Kaiser Foundation Health and Hospitals, Oakland, CA, USA

Dave deBronkart
e-Patient Dave, LLC, Nashua, NH, USA

Gauden Galea
Regional Office for Europe, World Health Organization, Copenhagen, Denmark

Harold F. Wolf III
Healthcare Information and Management Systems Society, Chicago, IL, USA

Jacob Waxman
Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat Gan, Israel

Joshua C. Mandel
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Juliana Rotich
Gates Foundation, Seattle, WA, USA

Kenneth D. Mandl
Professor, Harvard Med. Director, Computational Health Informatics Program, Boston Children's
Biomedical Informatics, Population Health

Maryam Mustafa
Lahore University of Management Sciences (LUMS)
HCI, Health Tech, AI

Melissa Miles
Department of Pediatrics, Harvard Medical School, Boston, MA, USA

Nigam H. Shah
Technology and Digital Solutions, Stanford Healthcare, Palo Alto, CA, USA

Peter Lee
Microsoft Research, Redmond, WA, USA

Robert Korom
Penda Health, Nairobi, Kenya

Scott Mahoney
Gates Foundation, Seattle, WA, USA

Seth Hain
Epic Systems Corporation, Verona, WI, USA

Tien Yin Wong
Tsinghua Medicine, Tsinghua University, Beijing, China

Trevor Mundel
Department of Pediatrics, Harvard Medical School, Boston, MA, USA

Vivek Natarajan
Google DeepMind, London, UK

Noa Dagan
Clalit Research Institute and Ben-Gurion University, Israel
Clinical prediction models, Causal inference, Algorithmic fairness

David A. Clifton
Chair of Clinical Machine Learning, University of Oxford
Machine Learning, Clinical AI, Biomedical Signal Processing