🤖 AI Summary
To address inaccurate patient-side arm position estimation in telesurgical robotics under the Tactile Internet, caused by network latency, jitter, and packet loss, this paper proposes an Informer-based prediction framework paired with a four-state Hidden Markov Model (4-State HMM) that simulates realistic packet loss. Methodologically, it embeds a differentiable optimization layer into time-series forecasting to explicitly enforce energy efficiency, motion smoothness, and robustness constraints during training. It relies on Informer's ProbSparse self-attention and self-attention distilling to focus on position-critical features, and employs a generative-style decoder to improve prediction stability. Evaluated on the JIGSAWS dataset, the framework achieves >90% position prediction accuracy, outperforming TCN, RNN, and LSTM baselines. Crucially, it satisfies stringent telesurgery requirements across diverse network impairment scenarios: end-to-end latency <100 ms and packet loss rate ≤15%, thereby ensuring both real-time responsiveness and operational reliability.
📝 Abstract
Precise, real-time estimation of the patient-side robotic arm's position is essential for the success of remote robotic surgery in Tactile Internet (TI) environments. This paper presents a prediction model built on Informer, a Transformer-based framework, for accurate and efficient position estimation. It also incorporates a Four-State Hidden Markov Model (4-State HMM) to simulate realistic packet loss scenarios. The proposed approach addresses network delays, jitter, and packet loss to ensure reliable and precise operation in remote surgical applications. The method integrates the optimization problem into the Informer model by embedding constraints such as energy efficiency, smoothness, and robustness into its training process through a differentiable optimization layer. The Informer framework uses ProbSparse attention, attention distilling, and a generative-style decoder to focus on position-critical features while keeping computational complexity at O(L log L), where L is the input sequence length. The method is evaluated on the JIGSAWS dataset and achieves a prediction accuracy of over 90 percent under various network scenarios. A comparison with TCN, RNN, and LSTM models demonstrates the Informer framework's superior performance in position prediction and in meeting real-time requirements, making it suitable for Tactile Internet-enabled robotic surgery.
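To make the packet-loss simulation concrete, the following is a minimal sketch of how a four-state Markov loss process could be sampled. It is not the paper's implementation: the state names, the transition matrix `TRANSITIONS`, the per-state loss probabilities `LOSS_PROB`, and the function `simulate_packet_loss` are all hypothetical values and names chosen for illustration. The network state is hidden from the receiver, which observes only whether each packet arrived.

```python
import random

# Hypothetical parameters for illustration only; none are taken from the paper.
STATES = ("good", "mostly_good", "mostly_bad", "bad")

# TRANSITIONS[i][j]: probability of moving from state i to state j per packet.
TRANSITIONS = [
    [0.95, 0.04, 0.01, 0.00],
    [0.10, 0.80, 0.08, 0.02],
    [0.02, 0.08, 0.80, 0.10],
    [0.00, 0.05, 0.15, 0.80],
]

# LOSS_PROB[i]: probability that a packet sent in state i is lost.
LOSS_PROB = [0.0, 0.05, 0.40, 1.0]


def simulate_packet_loss(n_packets, seed=0, start_state=0):
    """Sample a hidden state path and the observed loss flag for each packet."""
    rng = random.Random(seed)
    state = start_state
    states, lost = [], []
    for _ in range(n_packets):
        states.append(state)
        # Emission: the packet is lost with the current state's loss probability.
        lost.append(rng.random() < LOSS_PROB[state])
        # Transition: draw the next hidden state from the current state's row.
        state = rng.choices(range(len(STATES)), weights=TRANSITIONS[state])[0]
    return states, lost


if __name__ == "__main__":
    states, lost = simulate_packet_loss(10_000, seed=42)
    print(f"empirical loss rate: {sum(lost) / len(lost):.3f}")
```

A loss mask produced this way could be applied to a kinematic sequence such as JIGSAWS before it is fed to a predictor, so that the model is trained and evaluated on bursty rather than independent losses. How the paper parameterizes its four states is not stated in the abstract.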