EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pixel-level segmentation of hands and surgical tools in first-person open-surgery videos remains an underexplored challenge. Method: This paper introduces the first hand-tool interaction segmentation task tailored to this domain and presents EgoSurgery-HTS, the first fine-grained egocentric hand-tool segmentation dataset, featuring pixel-level annotations for 14 tool categories, both hands, and hand-tool interaction regions. A unified three-level (tool/hand/interaction) annotation framework is proposed, with annotations rigorously refined by experts through multiple validation rounds. Benchmark evaluations are conducted on mainstream instance segmentation models, including Mask R-CNN. Contribution/Results: The approach establishes new state-of-the-art performance, improving mAP by 12.6%. The EgoSurgery-HTS dataset is publicly released as a foundational resource for visual understanding in open surgery.

📝 Abstract
Egocentric open-surgery videos capture rich, fine-grained details essential for accurately modeling surgical procedures and human behavior in the operating room. A detailed, pixel-level understanding of hands and surgical tools is crucial for interpreting a surgeon's actions and intentions. We introduce EgoSurgery-HTS, a new dataset with pixel-wise annotations and a benchmark suite for segmenting surgical tools, hands, and interacting tools in egocentric open-surgery videos. Specifically, we provide a labeled dataset for (1) tool instance segmentation of 14 distinct surgical tools, (2) hand instance segmentation, and (3) hand-tool segmentation to label hands and the tools they manipulate. Using EgoSurgery-HTS, we conduct extensive evaluations of state-of-the-art segmentation methods and demonstrate significant improvements in the accuracy of hand and hand-tool segmentation in egocentric open-surgery videos compared to existing datasets. The dataset will be released at https://github.com/Fujiry0/EgoSurgery.
Problem

Research questions and friction points this paper is trying to address.

Pixel-level segmentation of surgical tools and hands in egocentric open-surgery videos
Accurate segmentation of hand-tool interactions
Lack of benchmarks for interpreting surgeons' actions in the operating room
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pixel-wise annotated dataset covering 14 distinct surgical tool categories
Benchmark suite for tool, hand, and hand-tool instance segmentation
Significant accuracy gains over models evaluated on existing datasets
Nathan Darjana
Keio University, Yokohama, Kanagawa, Japan

Ryo Fujii
Keio University
Computer Vision, Machine Learning, Trajectory Prediction, Video Understanding

Hideo Saito
Keio University
Computer vision, augmented reality, image processing

Hiroki Kajita
Keio University School of Medicine, Shinjuku, Tokyo, Japan