Monitoring Deployed AI Systems in Health Care

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Ensuring sustained safety, performance stability, and clinical value of deployed medical AI systems remains a critical governance challenge. Method: This study proposes the first post-deployment monitoring framework integrating three dimensions—system integrity, dynamic performance stability, and real-world clinical impact—grounded in three synergistic principles: quantifiable metrics, clear accountability assignment, and closed-loop response orchestration. The framework is compatible with both traditional and generative AI systems and technically integrates runtime error detection, input distribution shift analysis, clinical workflow-embedded feedback collection, and multi-tiered dashboards. Contribution/Results: Validated at Stanford Health Care, the framework reduced mean AI system fault response time by 62%, achieved an 89% early detection rate for critical performance degradation, and produced a reusable, standardized monitoring plan template.

📝 Abstract
Post-deployment monitoring of artificial intelligence (AI) systems in health care is essential to ensure their safety, quality, and sustained benefit, and to support governance decisions about which systems to update, modify, or decommission. Motivated by these needs, we developed a framework for monitoring deployed AI systems grounded in the mandate to take specific actions when they fail to behave as intended. This framework, which is now actively used at Stanford Health Care, is organized around three complementary principles: system integrity, performance, and impact. System integrity monitoring focuses on maximizing system uptime, detecting runtime errors, and identifying when changes to the surrounding IT ecosystem have unintended effects. Performance monitoring focuses on maintaining accurate system behavior in the face of changing health care practices (and thus input data) over time. Impact monitoring assesses whether a deployed system continues to have value in the form of benefit to clinicians and patients. Drawing on examples of deployed AI systems at our academic medical center, we provide practical guidance for creating monitoring plans based on these principles that specify which metrics to measure, when those metrics should be reviewed, who is responsible for acting when metrics change, and what concrete follow-up actions should be taken, for both traditional and generative AI. We also discuss challenges to implementing this framework, including the effort and cost of monitoring for health systems with limited resources and the difficulty of incorporating data-driven monitoring practices into complex organizations where conflicting priorities and definitions of success often coexist. This framework offers a practical template and starting point for health systems seeking to ensure that AI deployments remain safe and effective over time.
Problem

Research questions and friction points this paper is trying to address.

How to ensure the safety, quality, and sustained benefit of AI systems deployed in health care.
How to monitor the system integrity, performance, and clinical impact of deployed AI.
How to give health systems practical guidance on monitoring plans and follow-up actions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

A framework for monitoring AI system integrity, performance, and impact.
Practical guidance on metrics, review cadence, responsibility, and follow-up actions.
Active use at Stanford Health Care to keep deployments safe and effective over time.
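Per the abstract, each monitoring plan pairs a metric with a review cadence, an accountable owner, and a concrete follow-up action. A minimal sketch of such a plan as a data structure; all class names, thresholds, roles, and the example system are illustrative assumptions, not details from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class MonitoringMetric:
    """One metric in a monitoring plan: what to measure, and when and how to act."""
    name: str                 # e.g. "runtime_errors_per_week" (hypothetical)
    dimension: str            # "integrity", "performance", or "impact"
    threshold: float          # value beyond which follow-up is triggered
    review_cadence_days: int  # how often the metric is reviewed
    owner: str                # role accountable for acting on a breach
    follow_up_action: str     # concrete action when the threshold is crossed

@dataclass
class MonitoringPlan:
    """A per-system plan covering the three monitoring dimensions."""
    system_name: str
    metrics: list[MonitoringMetric] = field(default_factory=list)

    def breached(self, observed: dict[str, float]) -> list[MonitoringMetric]:
        """Return the metrics whose observed value exceeds their threshold."""
        return [m for m in self.metrics
                if observed.get(m.name, 0.0) > m.threshold]

# Example: a plan for a hypothetical deterioration-prediction model.
plan = MonitoringPlan(
    system_name="deterioration-predictor",
    metrics=[
        MonitoringMetric("runtime_errors_per_week", "integrity", 5,
                         7, "IT operations", "open incident ticket"),
        MonitoringMetric("input_drift_score", "performance", 0.2,
                         30, "data science lead", "trigger model revalidation"),
    ],
)
alerts = plan.breached({"runtime_errors_per_week": 8, "input_drift_score": 0.1})
```

Only the runtime-error metric is breached here, so `alerts` names its owner and follow-up action; a real plan would also cover impact metrics such as clinician-reported benefit.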
Authors
Timothy Keyes
Alison Callahan (Stanford University; electronic health records, clinical decision support, data mining, causal inference, machine learning)
Abby S. Pandya
Nerissa Ambers
Juan M. Banda (Stanford Health Care; generative AI, biomedical informatics, big data mining, large-scale retrieval, machine learning)
Miguel Fuentes
Carlene Lugtu
Pranav Masariya
Srikar Nallan
Connor O'Brien
Thomas Wang
Emily Alsentzer (Assistant Professor, Stanford University; machine learning for healthcare)
Jonathan H. Chen
Dev Dash
Matthew A. Eisenberg
Patricia Garcia
Nikesh Kotecha
Anurang Revri
Michael A. Pfeffer
Nigam H. Shah
Sneha S. Jain