The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment

📅 2025-09-21
🤖 AI Summary
In laparoscopic cholecystectomy, the Critical View of Safety (CVS) is universally recommended yet inconsistently performed, and manual assessment of it is subjective, which hinders reliable surgical quality evaluation. Method: The SAGES CVS Challenge, the first AI competition organized by a surgical society, curated a multicenter dataset of 1,000 videos from 54 institutions across 24 countries, annotated by 20 surgical experts under a consensus-validated protocol. The EndoGlacier framework was developed to manage large, heterogeneous surgical video and multi-annotator workflows. Contribution/Results: Thirteen international teams participated, achieving up to a 17% relative gain in assessment performance, over 80% reduction in calibration error, and a 17% relative improvement in robustness over the state of the art. Analysis of the results highlights methodological trends linked to model performance and offers guidance toward robust, clinically deployable AI for surgical quality assessment.

📝 Abstract
Advances in artificial intelligence (AI) for surgical quality assessment promise to democratize access to expertise, with applications in training, guidance, and accreditation. This study presents the SAGES Critical View of Safety (CVS) Challenge, the first AI competition organized by a surgical society, using the CVS in laparoscopic cholecystectomy, a universally recommended yet inconsistently performed safety step, as an exemplar of surgical quality assessment. A global collaboration across 54 institutions in 24 countries engaged hundreds of clinicians and engineers to curate 1,000 videos annotated by 20 surgical experts according to a consensus-validated protocol. The challenge addressed key barriers to real-world deployment in surgery, including achieving high performance, capturing uncertainty in subjective assessment, and ensuring robustness to clinical variability. To enable this scale of effort, we developed EndoGlacier, a framework for managing large, heterogeneous surgical video and multi-annotator workflows. Thirteen international teams participated, achieving up to a 17% relative gain in assessment performance, over 80% reduction in calibration error, and a 17% relative improvement in robustness over the state-of-the-art. Analysis of results highlighted methodological trends linked to model performance, providing guidance for future research toward robust, clinically deployable AI for surgical quality assessment.
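The abstract reports results as relative gains and as a reduction in calibration error, but does not say which calibration metric was used. As a rough illustration only, the sketch below assumes expected calibration error (ECE) for a binary criterion and shows how a relative change is computed. The function names and all numbers used with them are hypothetical and are not taken from the challenge.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE (an assumed metric, not confirmed by the source).

    Predictions are grouped into equal-width probability bins; the gap
    between mean predicted probability and observed positive rate in each
    bin is averaged, weighted by the share of samples in that bin.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Map each probability to a bin index in [0, n_bins - 1].
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(labels[mask].mean() - probs[mask].mean())
        ece += mask.mean() * gap
    return float(ece)


def relative_change(baseline, new):
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline
```

Under this definition, a hypothetical score that rises from 0.60 to 0.702 is a 17% relative gain, and a hypothetical calibration error that falls from 0.10 to 0.02 is an 80% relative reduction.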
Problem

Research questions and friction points this paper is trying to address.

Developing AI benchmarks for surgical quality assessment in laparoscopic cholecystectomy
Addressing barriers to real-world AI deployment in surgical settings
Creating robust AI systems for subjective surgical safety evaluations
Innovation

Methods, ideas, or system contributions that make the work stand out.

EndoGlacier framework for managing surgical video workflows
AI competition addressing uncertainty in subjective assessment
Multi-annotator consensus protocol for surgical quality validation

Deepak Alapatt
Scialytics SAS, France

Jennifer Eckhoff
University Hospital Cologne, Germany

Zhiliang Lyu
Massachusetts General Hospital, Harvard Medical School, USA

Yutong Ban
Global College, Shanghai Jiao Tong University, China

Jean-Paul Mazellier
IHU Strasbourg, France

Sarah Choksi
Lenox Hill Hospital, Northwell Health, USA

Kunyi Yang
Global College, Shanghai Jiao Tong University, China

2024 CVS Challenge Consortium

Quanzheng Li
Massachusetts General Hospital, Harvard Medical School
Image Reconstruction, Medical Image Analysis, Deep Learning in Medicine, Multimodality Medical Data Analysis

Filippo Filicori
Lenox Hill Hospital, Northwell Health, USA

Xiang Li
Massachusetts General Hospital, Harvard Medical School, USA

Pietro Mascagni
Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy; Institute of Image Guided
Surgery, Surgical Data Science, Surgical Education, Surgical Safety

Daniel A. Hashimoto
University of Pennsylvania, USA

Guy Rosman
Toyota Research Institute; Massachusetts General Hospital; Duke Surgery
Computer vision and robotic perception, Bayesian inference, trajectory prediction

Ozanan Meireles
Duke University, USA

Nicolas Padoy
Professor of Computer Science, University of Strasbourg
Surgical Data Science, Medical Image Analysis, Computer Vision, Machine Learning