YOLOv10-Based Multi-Task Framework for Hand Localization and Laterality Classification in Surgical Videos

📅 2026-02-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a multi-task perception framework based on YOLOv10 for real-time hand localization and left–right hand classification in trauma surgery videos. It represents the first application of YOLOv10 to surgical scenarios, using a shared backbone with dedicated detection heads to perform hand detection and laterality classification simultaneously. Tailored data augmentation is used to improve robustness against motion blur, illumination variations, and diverse hand appearances. Evaluated on the Trauma THOMPSON Challenge 2025 Task 2 dataset, the model achieves classification accuracies of 67% for the left hand and 71% for the right hand, with an mAP@[0.5:0.95] of 0.33, while maintaining real-time inference speed; distinguishing hands from the background remains challenging. The approach lays a foundation for hand-instrument interaction analysis in emergency surgical procedures.

📝 Abstract
Real-time hand tracking in trauma surgery is essential for supporting rapid and precise intraoperative decisions. We propose a YOLOv10-based framework that simultaneously localizes hands and classifies their laterality (left or right) in complex surgical scenes. The model is trained on the Trauma THOMPSON Challenge 2025 Task 2 dataset, consisting of first-person surgical videos with annotated hand bounding boxes. Extensive data augmentation and a multi-task detection design improve robustness against motion blur, lighting variations, and diverse hand appearances. Evaluation demonstrates accurate left-hand (67%) and right-hand (71%) classification, while distinguishing hands from the background remains challenging. The model achieves an $\mathrm{mAP}_{[0.5:0.95]}$ of 0.33 and maintains real-time inference, highlighting its potential for intraoperative deployment. This work establishes a foundation for advanced hand-instrument interaction analysis in emergency surgical procedures.
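
The mAP over [0.5:0.95] reported above is conventionally averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 (the COCO convention, which the notation suggests but the abstract does not state explicitly). The sketch below illustrates only the IoU computation and the threshold grid, not the full precision-recall averaging. It assumes boxes in (x1, y1, x2, y2) format; the function names are illustrative and not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Assumed threshold grid: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Count the thresholds at which a prediction would match the ground truth."""
    overlap = iou(pred_box, gt_box)
    return sum(1 for t in IOU_THRESHOLDS if overlap >= t)
```

A predicted hand box that overlaps its ground truth with IoU 0.72 counts as a match at only five of the ten thresholds, which is one reason a score averaged over this range sits well below a score at IoU 0.5 alone.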

Problem

Research questions and friction points this paper is trying to address.

hand localization
laterality classification
surgical videos
real-time hand tracking
trauma surgery

Innovation

Methods, ideas, or system contributions that make the work stand out.

YOLOv10
multi-task learning
hand localization
laterality classification
surgical video analysis
Kedi Sun
School of Engineering, College of Engineering and Physical Sciences, University of Birmingham, Birmingham, UK
Le Zhang
Assistant Professor, University of Birmingham
Medical Image Computing, Generative AI, Medical LLMs, Digital Healthcare