XiCAD: Camera Activation Detection in the Da Vinci Xi User Interface

📅 2025-11-25
🤖 AI Summary
Problem: Automatically identifying whether the endoscope is active in da Vinci Xi robotic surgery videos remains challenging because the visual cues are ambiguous and multi-camera configurations are common. Method: The authors propose a lightweight, UI-aware detection method based on a ResNet18 backbone, trained end to end for binary classification. It localizes the camera tile within the native Xi user interface and infers the activation state from the tile's visual indicators, producing camera motion metadata in real time. Contribution/Results: The summary describes this as the first robust, UI-native activation detector for the Xi system, one that avoids false positives from concurrent cameras. Evaluated on more than 70,000 frames from real surgical videos, it achieves F1 scores of 0.993 to 1.000 and localizes the camera tile without duplicate detections. The resulting structured metadata can support downstream applications such as intraoperative instrument tracking, surgical skill assessment, and automated camera control, addressing a gap in da Vinci video metadata parsing.

📝 Abstract
Purpose: Robot-assisted minimally invasive surgery relies on endoscopic video as the sole intraoperative visual feedback. The DaVinci Xi system overlays a graphical user interface (UI) that indicates the state of each robotic arm, including the activation of the endoscope arm. Detecting this activation provides valuable metadata such as camera movement information, which can support downstream surgical data science tasks including tool tracking, skill assessment, or camera control automation.

Methods: We developed a lightweight pipeline based on a ResNet18 convolutional neural network to automatically identify the position of the camera tile and its activation state within the DaVinci Xi UI. The model was fine-tuned on manually annotated data from the SurgToolLoc dataset and evaluated across three public datasets comprising over 70,000 frames.

Results: The model achieved F1-scores between 0.993 and 1.000 for the binary detection of active cameras and correctly localized the camera tile in all cases without false multiple-camera detections.

Conclusion: The proposed pipeline enables reliable, real-time extraction of camera activation metadata from surgical videos, facilitating automated preprocessing and analysis for diverse downstream applications. All code, trained models, and annotations are publicly available.
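
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a binary F1 score for the "camera active" class can be computed from per-frame labels. The function name `binary_f1` and the toy label lists are illustrative assumptions; they are not taken from the paper's code or data.

```python
from typing import Sequence


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive class (1 = camera active, 0 = inactive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when there are no positives at all
    return 2 * tp / denom if denom else 0.0


# Toy per-frame labels (hypothetical, not results from the paper)
truth = [0, 0, 1, 1, 1, 0, 1, 0]
preds = [0, 0, 1, 1, 0, 0, 1, 1]
print(binary_f1(truth, preds))  # 3 TP, 1 FP, 1 FN -> 0.75
```

An F1 score near 1.000, as reported in the abstract, means that both missed activations and spurious activations are rare across the evaluated frames.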
Problem

Research questions and friction points this paper is trying to address.

Detecting camera activation state in DaVinci Xi surgical interface
Automating extraction of camera movement metadata from endoscopic videos
Enabling automated preprocessing for surgical data science applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight ResNet18 pipeline for camera activation detection
Model fine-tuned on manually annotated data from the SurgToolLoc dataset
Real-time extraction of endoscopic camera metadata
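
As an illustration of how per-frame activation predictions might be turned into camera movement metadata, the sketch below groups consecutive active frames into time intervals. This post-processing step, the function name `active_intervals`, and the frame rate are assumptions made for illustration; the listing does not describe how the authors structure their output.

```python
from typing import List, Sequence, Tuple


def active_intervals(flags: Sequence[bool], fps: float) -> List[Tuple[float, float]]:
    """Group consecutive active frames into (start_s, end_s) intervals.

    flags[i] is True when the camera is predicted active in frame i.
    The end time is exclusive: it marks the end of the last active frame.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    intervals: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current active run
    for i, active in enumerate(flags):
        if active and start is None:
            start = i
        elif not active and start is not None:
            intervals.append((start / fps, i / fps))
            start = None
    if start is not None:  # close a run that lasts until the final frame
        intervals.append((start / fps, len(flags) / fps))
    return intervals


# Hypothetical predictions for six frames at 1 frame per second
flags = [False, True, True, False, False, True]
print(active_intervals(flags, fps=1.0))  # [(1.0, 3.0), (5.0, 6.0)]
```

Interval lists of this kind are one compact way to hand activation metadata to downstream tasks such as tool tracking or skill assessment, which often need to know when the view is moving.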
Alexander C. Jenke
PhD Student @ National Center for Tumor Diseases (NCT) Dresden
Surgical Scene Segmentation, Deep Learning, Computer Assisted Surgery, Surgical Data Science
Gregor Just
Department of Translational Surgical Oncology, National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between DKFZ, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany.
Claas de Boer
Department of Translational Surgical Oncology, National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between DKFZ, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany.
Martin Wagner
Department of Visceral, Thoracic and Vascular Surgery, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD, Germany.
Sebastian Bodenstedt
National Center for Tumor Diseases (NCT) Dresden
Stefanie Speidel
Professor, National Center for Tumor Diseases (NCT) Dresden
Computer- and robotic-assisted surgery, Surgical data science