🤖 AI Summary
Dual-arm robotic suturing teleoperation faces two fundamental bottlenecks: poor generalizability of fully autonomous systems and performance limitations of purely manual systems—namely, susceptibility to communication latency, perceptual deficits, and motion instability. To address these challenges, this paper proposes a hierarchical interactive control framework. At the high level, a Transformer-based model performs real-time recognition of surgical action units (surgemes) to infer high-level surgical intent. At the low level, a confidence-weighted intent fusion controller enables shared autonomy by integrating human input and robotic execution for motion control. Our key innovation lies in jointly leveraging semantic-level surgeme recognition and dynamics-aware multimodal sensor fusion—integrating kinematic and force-torque data—to establish a real-time closed-loop collaborative architecture. Cross-proficiency experiments demonstrate statistically significant improvements: reduced task completion time and enhanced user satisfaction (p < 0.01).
📝 Abstract
Robotic-assisted procedures offer enhanced precision, but both ends of the autonomy spectrum have drawbacks: fully autonomous systems are limited by incomplete task knowledge, difficulty in modeling unstructured environments, and poor generalisation, while fully manual teleoperated systems face challenges such as delay, instability, and reduced sensory information. To address these challenges, we developed an interactive control strategy that assists the human operator by predicting their motion plan at both high and low levels. At the high level, a surgeme recognition system, implemented as a Transformer-based real-time gesture classification model, dynamically adapts to the operator's actions; at the low level, a Confidence-based Intention Assimilation Controller adjusts robot actions based on user intent and shared control paradigms. The system is built around a robotic suturing task, supported by sensors that capture the kinematics of the robot and task dynamics. Experiments across users with varying skill levels demonstrated the effectiveness of the proposed approach, showing statistically significant improvements in task completion time and user satisfaction compared to traditional teleoperation.
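The low-level idea of weighting robot assistance by recognition confidence can be illustrated with a minimal sketch. This is not the paper's Confidence-based Intention Assimilation Controller, whose formulation the abstract does not give; it is a generic linear blend under stated assumptions. The function name `blend_commands`, the `max_assist` cap, and the use of Cartesian velocity commands are all hypothetical choices made for illustration.

```python
import numpy as np


def blend_commands(human_cmd, robot_cmd, confidence, max_assist=0.8):
    """Linearly blend operator and autonomous velocity commands.

    human_cmd, robot_cmd: array-like velocity commands of equal shape.
    confidence: classifier confidence for the recognised surgeme, in [0, 1].
    max_assist: hypothetical cap on autonomy authority, so the operator
        always keeps at least (1 - max_assist) of the control share.
    """
    human_cmd = np.asarray(human_cmd, dtype=float)
    robot_cmd = np.asarray(robot_cmd, dtype=float)
    # Clamp confidence so out-of-range values cannot over-weight the robot.
    alpha = max_assist * float(np.clip(confidence, 0.0, 1.0))
    return (1.0 - alpha) * human_cmd + alpha * robot_cmd


if __name__ == "__main__":
    operator = [1.0, 0.0, 0.0]   # operator pushes along x
    planner = [0.0, 1.0, 0.0]    # assumed autonomous suggestion along y
    for c in (0.0, 0.5, 1.0):
        print(c, blend_commands(operator, planner, c))
```

With zero confidence the operator's command passes through unchanged, and as confidence rises the assumed autonomous suggestion gains authority up to the cap. Capping the assistance is one common shared-control design choice; whether the paper uses a cap, or a linear blend at all, is not stated in the abstract.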