AI Summary
This study addresses the challenges of skill reuse and misaligned risk assessment in deploying medical AI agents across institutions. It introduces “agent skills” as a programmable intermediate layer for medical AI adaptation, leveraging the ClawHub platform to curate 557 healthcare-related skills. Through expert annotation across ten dimensions and empirical analysis, the work systematically characterizes these skills in terms of functionality, deployment contexts, autonomy, and safety. Findings reveal a predominant focus on patient-facing workflow automation, with insufficient coverage of core clinical tasks and uneven representation across the healthcare lifecycle. Moreover, conventional technical risk metrics fail to capture real-world clinical risks. These insights expose critical blind spots in current benchmarking and governance frameworks, offering a novel paradigm for evaluating the transferability and safety of medical AI systems.
Abstract
Healthcare automation is shaped by local procedures and organizational constraints, so agent capabilities rarely transfer unchanged across settings. Agent skills, self-contained directories that package reusable procedures for AI agents, are emerging as a procedural layer for adapting healthcare agents across diverse healthcare settings. We present the first empirical analysis of healthcare agent skills, drawing on 557 healthcare-related skills filtered from 58,159 public skills on ClawHub and annotated along ten dimensions covering function, deployment context, autonomy, and safety. We find that public healthcare skills emphasize patient-facing workflow automation and monitoring rather than the diagnostic and treatment-oriented tasks foregrounded in healthcare-agent research; coverage of the healthcare lifecycle and specialized clinical inputs remains uneven; and general technical risk does not reliably capture clinical risk. These findings position healthcare skills as a procedural layer not yet addressed by current benchmarks and risk frameworks.