🤖 AI Summary
Traditional surgical skill training is constrained by the scarcity of expert instructors and by subjective, non-quantitative performance assessments, so autonomous practice lacks personalized, objective, and measurable feedback. This paper introduces an explainable artificial intelligence (XAI) framework tailored for surgical education, integrating computer vision with action-level machine learning models to extract skill proxy metrics, temporally aligned with fundamental surgical motions, from procedural videos. By benchmarking against expert-derived ground truth and visualizing motion-level deviations, the system generates transparent, actionable, and individualized feedback. Its key innovation lies in the systematic integration of XAI into the surgical training loop, unifying interpretability with pedagogical relevance. A user study demonstrates statistically significant reductions in cognitive load (p < 0.01), a 32% increase in self-efficacy, and a 41% reduction in motion deviation from expert norms.
📄 Abstract
Traditional surgical skill acquisition relies heavily on expert feedback, yet direct access to experts is limited by faculty availability and variability in subjective assessments. While trainees can practice independently, the lack of personalized, objective, and quantitative feedback reduces the effectiveness of self-directed learning. Recent advances in computer vision and machine learning have enabled automated surgical skill assessment, demonstrating the feasibility of automatic competency evaluation. However, it remains unclear whether such Artificial Intelligence (AI)-driven feedback can contribute to skill acquisition. Here, we examine the effectiveness of explainable AI (XAI)-generated feedback in surgical training through a human-AI study. We create a simulation-based training framework that uses XAI to analyze videos and extract surgical skill proxies related to primitive actions. Our intervention provides automated, user-specific feedback by comparing trainee performance to expert benchmarks and highlighting deviations from optimal execution through understandable proxies, yielding actionable guidance. In a prospective user study with medical students, we compare the impact of XAI-guided feedback against traditional video-based coaching on task outcomes, cognitive load, and trainees' perceptions of AI-assisted learning. Results showed reduced cognitive load and increased confidence post-intervention. While no differences emerged between the two feedback types in reducing performance gaps or prompting practice adjustments, trends in the XAI group were encouraging: participants more closely mimicked expert practice. This work encourages the study of explainable AI in surgical education and the development of data-driven, adaptive feedback mechanisms that could transform learning experiences and competency assessment.
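The core feedback mechanism described above, comparing a trainee's skill-proxy metrics against expert benchmarks and surfacing the largest deviations, can be illustrated with a minimal sketch. This is not the authors' implementation: the proxy names, expert values, and z-score ranking are all hypothetical placeholders standing in for whatever proxies and comparison statistics the framework actually uses.

```python
from statistics import mean, stdev

# Hypothetical expert proxy distributions (illustrative values only).
EXPERT_BENCHMARKS = {
    "needle_insertion_time_s": [4.1, 3.8, 4.5, 4.0],
    "path_length_mm": [120.0, 115.0, 131.0, 124.0],
    "idle_ratio": [0.05, 0.08, 0.06, 0.07],
}

def rank_deviations(trainee_metrics):
    """Rank skill proxies by absolute z-score relative to expert norms,
    so feedback can focus on the largest departures from expert practice."""
    deviations = []
    for name, expert_values in EXPERT_BENCHMARKS.items():
        mu, sigma = mean(expert_values), stdev(expert_values)
        z = (trainee_metrics[name] - mu) / sigma
        deviations.append((name, round(z, 2)))
    return sorted(deviations, key=lambda d: abs(d[1]), reverse=True)

# Example trainee measurements (hypothetical).
trainee = {
    "needle_insertion_time_s": 9.0,
    "path_length_mm": 180.0,
    "idle_ratio": 0.20,
}
for proxy, z in rank_deviations(trainee):
    print(f"{proxy}: {z:+.2f} SD from expert mean")
```

Ranking by standardized deviation is one simple way to make the feedback both individualized and interpretable: each flagged proxy maps to a concrete, nameable aspect of the trainee's motion rather than a single opaque score.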