Model selection and real-time skill assessment for suturing in robotic surgery

📅 2026-01-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of standardized, real-time feedback in suturing skill assessment during robot-assisted surgery by proposing a multimodal deep learning architecture that integrates kinematic and visual modalities. Using synchronized data collected from the da Vinci Surgical System, the model produces fine-grained, dynamic skill predictions that track the surgeon's gestures. Evaluated with skill-level-based cross-validation and mean Spearman's correlation coefficients, the fusion approach consistently outperforms unimodal baselines in real-time prediction. Experiments also indicate that expert-level training data play an important role in the model’s generalization. The study thus presents an approach to objective, real-time surgical skill assessment in robotic surgery.

📝 Abstract

Automated feedback systems have the potential to provide objective skill assessment for training and evaluation in robot-assisted surgery. In this study, we examine methods for real-time prediction of surgical skill level based on Objective Structured Assessment of Technical Skills (OSATS) scores. Using data acquired from the da Vinci Surgical System, we carry out three main analyses, focusing on model design, real-time performance, and skill-level-based cross-validation. For the model design, we evaluate the effectiveness of multimodal deep learning models for predicting surgical skill levels using synchronized kinematic and vision data. Our models include separate unimodal baselines and fusion architectures that integrate features from both modalities. All models are evaluated using mean Spearman's correlation coefficients, and the fusion model consistently outperforms the unimodal models for real-time predictions. For the real-time performance, we observe the prediction's trend over time and highlight its correlation with the surgeon's gestures. For the skill-level-based cross-validation, we trained separate models on surgeons with different skill levels; models trained on high-skill demonstrations performed better than those trained on low-skill ones and generalized well to similarly skilled participants. Our findings show that multimodal learning allows more stable, fine-grained evaluation of surgical performance and highlight the value of expert-level training data for model generalization.
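The evaluation metric named in the abstract, mean Spearman's correlation, can be illustrated with a minimal sketch. It assumes that each cross-validation fold yields a list of predicted scores and a matching list of ground-truth OSATS scores, and that the reported figure is the plain average of the per-fold coefficients. The abstract does not spell out either detail, so the function names, the fold structure, and the example numbers below are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based position of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (var_x * var_y) ** 0.5


def mean_spearman(folds):
    """Average the per-fold coefficients (assumed aggregation, see lead-in)."""
    return mean(spearman(pred, true) for pred, true in folds)


# Hypothetical (predicted, ground-truth OSATS) pairs for two folds.
example_folds = [
    ([12.0, 18.5, 25.1, 30.2], [10, 20, 24, 29]),
    ([15.0, 14.0, 27.0, 22.0], [13, 16, 21, 28]),
]
score = mean_spearman(example_folds)
```

Because Spearman's coefficient depends only on rank order, a model is rewarded for ordering trials correctly by skill even when its predicted scores are offset or rescaled relative to the OSATS scale.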

Problem

Research questions and friction points this paper is trying to address.

model selection

real-time skill assessment

robotic surgery

surgical skill evaluation

suturing

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal deep learning

real-time skill assessment

robotic surgery