Intuitive Surgical SurgToolLoc Challenge Results: 2022-2023

📅 2023-05-11
📈 Citations: 17
Influential: 0
🤖 AI Summary
To address the challenge of real-time, robust surgical instrument localization in minimally invasive robotic-assisted surgery (RAS) video streams, this work introduces SurgToolLoc—the first large-scale, multi-view, multi-scenario benchmark dataset with pixel-level mask annotations. We further propose a novel evaluation protocol emphasizing both cross-center generalizability and real-time inference (≥30 FPS). Methodologically, we integrate instance segmentation and keypoint detection with temporal modeling (ConvLSTM/Transformer), domain adaptation, and weakly supervised learning. Our best-performing model achieves 92.4% mAP@0.5 on the test set while maintaining an inference speed of 36 FPS—substantially outperforming conventional template matching and early CNN-based approaches. The solution has undergone rigorous preclinical validation across multiple surgical scenarios. By providing a reproducible, scalable, end-to-end framework for visual instrument localization in RAS, this work establishes a new standard for benchmarking and advancing vision-based surgical navigation systems.
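The summary above quotes two kinds of figures: a detection score at an IoU threshold of 0.5 (mAP@0.5) and a frame rate (30 and 36 FPS). The following is only a minimal sketch of what those quantities mean, not the challenge's evaluation code. It assumes axis-aligned bounding boxes in `(x1, y1, x2, y2)` form, and the function names `box_iou`, `is_true_positive`, and `frame_budget_ms` are hypothetical helpers introduced for illustration.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumption: axis-aligned boxes with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return box_iou(pred, truth) >= threshold


def frame_budget_ms(fps):
    """Per-frame time budget, in milliseconds, implied by a frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 0, 3, 2)))
    print(is_true_positive((0, 0, 4, 4), (0, 0, 4, 2)))
    print(round(frame_budget_ms(30), 1), round(frame_budget_ms(36), 1))
```

Mean average precision additionally ranks predictions by confidence and averages precision over recall levels and classes; the sketch covers only the per-box matching rule that the "@0.5" suffix refers to.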
📝 Abstract
Robotic-assisted (RA) surgery promises to transform surgical intervention. Intuitive Surgical is committed to fostering these changes and the machine learning models and algorithms that will enable them. With these goals in mind, we have invited the surgical data science community to participate in a yearly competition hosted through the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference. Although the details have changed from year to year, each edition has challenged the community to solve difficult machine learning problems in the context of advanced RA applications. Here we document the results of these challenges, focusing on surgical tool localization (SurgToolLoc). The publicly released dataset that accompanies these challenges is detailed in a separate paper, arXiv:2501.09209 [1].
Problem

Research questions and friction points this paper is trying to address.

Develop machine learning models for robotic-assisted surgery.
Focus on surgical tool localization in advanced applications.
Document results of yearly MICCAI conference challenges.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Robotic-assisted surgery with machine learning
Annual MICCAI competition for surgical data
Surgical tool localization using public datasets
Aneeq Zia
Manager, Machine Learning Engineering and MLOps, Intuitive
Computer Vision, Machine Learning, Deep Learning, Robotics
Kiran D. Bhattacharyya
Intuitive Surgical, Inc.
Xi Liu
Intuitive Surgical, Inc.
Max Berniker
Intuitive Surgical
machine learning, Bayesian inference, neural networks, motor control and learning, computational neuroscience
Ziheng Wang
Intuitive Surgical, Inc.
Rogerio G. Nespolo
Intuitive Surgical, Inc.
Satoshi Kondo
Muroran Institute of Technology (formerly Konica Minolta, Inc. and Panasonic Corp.)
Computer vision
S. Kasai
Niigata University of Health and Welfare
Kousuke Hirasawa
Konica Minolta, Inc.
Bo Liu
NVIDIA, Inc.
David Austin
Deakin University
Psychology
Yiheng Wang
NVIDIA, Inc.
Michal Futrega
NVIDIA, Inc.
J. Puget
NVIDIA, Inc.
Zhenqiang Li
University of Tokyo
Yoichi Sato
Professor, Institute of Industrial Science, The University of Tokyo
Computer Vision, Human Computer Interaction
Ryoske Fujii
Keio University
Ryo Hachiuma
NVIDIA
Computer Vision, Machine Learning
Mana Masuda
SB Intuitions
Computer Vision, Machine Learning, Deep Learning
H. Saito
Keio University
An-Chi Wang
Shun Hing Institute of Advanced Engineering
Mengya Xu
The Chinese University of Hong Kong
Vision-Language based Surgical Scene Understanding
Mobarakol Islam
Wellcome EPSRC Centre for Interventional and Surgical Sciences
Long Bai
Research Assistant, Institute of Computing Technology, Chinese Academy of Sciences
Event-Centric Analysis, Knowledge Graph, Natural Language Processing
Winnie Pang
National University of Singapore (NUS), NUSRI SZ
Hongliang Ren
Chinese University of Hong Kong | National University of Singapore | JHU/Harvard(RF) | CUHK(PhD)
Biorobotics & intelligent systems, medical mechatronics, continuum, soft flexible robots/sensors, multisensory perception
C. Nwoye
University of Strasbourg, IHU Strasbourg
Luca Sestini
Politecnico di Milano, IHU Strasbourg
N. Padoy
University of Strasbourg, IHU Strasbourg
M. Nielsen
University Medical Center Hamburg-Eppendorf
Samuel Schuttler
University Medical Center Hamburg-Eppendorf
T. Sentker
University Medical Center Hamburg-Eppendorf
Hümeyra Husseini
University Medical Center Hamburg-Eppendorf
Ivo M. Baltruschat
University Medical Center Hamburg-Eppendorf
Rüdiger Schmitz
University Medical Center Hamburg-Eppendorf
R. Werner
University Medical Center Hamburg-Eppendorf
Aleksandr Matsun
Mohamed Bin Zayed University of Artificial Intelligence
Mugariya Farooq
Mohamed Bin Zayed University of Artificial Intelligence, Technology Innovation Institute
Genomics, Machine Learning, Bio-informatics, Natural Language Processing
Numan Saaed
Mohamed Bin Zayed University of Artificial Intelligence
Jose Renato Restom Viera
Mohamed Bin Zayed University of Artificial Intelligence
Mohammad Yaqub
Researcher in Biomedical Engineering, Associate professor at MBZUAI
Artificial Intelligence, Medical Image Analysis, Machine Learning, Deep learning
N. Getty
Argonne National Laboratory, University of Chicago, University of Illinois at Chicago
Fangfang Xia
Scientist, Argonne National Laboratory
Machine Learning, Bioinformatics, Neuromorphic Computing, High Performance Computing
Zixuan Zhao
Argonne National Laboratory, University of Chicago, University of Illinois at Chicago
Xiaotian Duan
Argonne National Laboratory, University of Chicago, University of Illinois at Chicago
X. Yao
Vanderbilt University
Ange Lou
Vanderbilt University
Medical image analysis, Image-guided Surgery
Hao Yang
Vanderbilt University
Jin Han
The University of Tokyo, National Institute of Informatics
computer vision
J. Noble
Vanderbilt University
J. Wu
Vanderbilt University
T. A. Alshirbaji
Furtwangen University, University of Leipzig
N. A. Jalal
Furtwangen University
H. Arabian
Furtwangen University
N. Ding
Furtwangen University
Knut Moeller
Furtwangen University, University of Canterbury, University of Freiburg
Weiliang Chen
Alibaba
AI System, Deep Learning
Q. He
Hikvision Research Institute
L. Maier-Hein
German Cancer Research Center
D. Stoyanov
University College London
S. Speidel
University of Strasbourg
A. Jarc
Intuitive Surgical, Inc.