ArthroCut: Autonomous Policy Learning for Robotic Bone Resection in Knee Arthroplasty

📅 2026-03-04
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limited contextual awareness and autonomous decision-making capabilities of current robotic systems in total knee arthroplasty, which hinder precise bone resection. The authors propose ArthroCut, a novel framework that integrates preoperative imaging with intraoperative multimodal data, including CT/MR scans, NDI tracking, RGB-D video, robotic states, and textual surgical intent. By introducing Preoperative Imaging Tokens (PIT) and Time-Aligned Surgical Tokens (TAST), and incorporating syntactic and safety-aware decoding constraints, ArthroCut enables interpretable and highly reliable autonomous planning and execution of six standard bone cuts. Leveraging a Qwen-VL backbone for multimodal tokenization, the method achieves an average success rate of 86% across seven benchtop experiments, significantly outperforming baseline approaches and demonstrating the efficacy of the proposed multimodal alignment and constrained action generation mechanism.

๐Ÿ“ Abstract
Despite rapid commercialization of surgical robots, their autonomy and real-time decision-making remain limited in practice. To address this gap, we propose ArthroCut, an autonomous policy learning framework that upgrades knee arthroplasty robots from assistive execution to context-aware action generation. ArthroCut fine-tunes a Qwen--VL backbone on a self-built, time-synchronized multimodal dataset from 21 complete cases (23,205 RGB--D pairs), integrating preoperative CT/MR, intraoperative NDI tracking of bones and end effector, RGB--D surgical video, robot state, and textual intent. The method operates on two complementary token families -- Preoperative Imaging Tokens (PIT) to encode patient-specific anatomy and planned resection planes, and Time-Aligned Surgical Tokens (TAST) to fuse real-time visual, geometric, and kinematic evidence -- and emits an interpretable action grammar under grammar/safety-constrained decoding. In bench-top experiments on a knee prosthesis across seven trials, ArthroCut achieves an average success rate of 86% over the six standard resections, significantly outperforming strong baselines trained under the same protocol. Ablations show that TAST is the principal driver of reliability while PIT provides essential anatomical grounding, and their combination yields the most stable multi-plane execution. These results indicate that aligning preoperative geometry with time-aligned intraoperative perception and translating that alignment into tokenized, constrained actions is an effective path toward robust, interpretable autonomy in orthopedic robotic surgery.
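The grammar- and safety-constrained decoding described in the abstract can be pictured as masking the model's token scores so that only grammatically valid and safe actions can be emitted. The sketch below is a minimal illustration of that idea; the action grammar, token vocabulary, depth bins, and safety bound are all illustrative assumptions, not the paper's actual interface.

```python
# Illustrative sketch of grammar- and safety-constrained action decoding.
# The grammar, token names, depth bins, and MAX_DEPTH_MM bound are assumed
# for illustration; ArthroCut's real decoder and vocabulary are not public.

PLANES = [
    "distal_femur", "anterior_femur", "posterior_femur",
    "anterior_chamfer", "posterior_chamfer", "proximal_tibia",
]  # the six standard resections mentioned in the abstract

DEPTH_TOKENS = ["2.0", "5.0", "9.0", "12.0", "15.0"]  # hypothetical depth bins
MAX_DEPTH_MM = 12.0  # hypothetical safety bound on resection depth


def pick(scores, allowed):
    """Greedy choice restricted to grammar-allowed tokens (masked decoding)."""
    return max(allowed, key=lambda t: scores.get(t, float("-inf")))


def decode_action(step_scores):
    """Decode one `CUT <plane> <depth_mm>` action under constraints.

    step_scores: one token->score dict per decoding step, standing in
    for the model's logits over the action vocabulary.
    """
    verb = pick(step_scores[0], ["CUT"])   # grammar: action must start with CUT
    plane = pick(step_scores[1], PLANES)   # grammar: a valid plane token next
    # Safety constraint: mask depth tokens beyond the safe bound before picking.
    safe = [d for d in DEPTH_TOKENS if float(d) <= MAX_DEPTH_MM]
    depth = pick(step_scores[2], safe)
    return (verb, plane, float(depth))


# Example: even though the raw scores favor an unsafe 15.0 mm depth,
# the masked decoder emits the best *safe* alternative.
scores = [
    {"CUT": 0.9, "MOVE": 0.4},
    {"proximal_tibia": 0.8, "distal_femur": 0.3},
    {"15.0": 0.7, "9.0": 0.6, "2.0": 0.1},
]
print(decode_action(scores))  # ('CUT', 'proximal_tibia', 9.0)
```

The key design point is that constraints are applied before token selection rather than by rejecting completed outputs, so the decoder can only ever produce grammatical, safety-compliant actions.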
Problem

Research questions and friction points this paper is trying to address.

surgical robot autonomy
knee arthroplasty
real-time decision-making
bone resection
context-aware action generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

autonomous policy learning
multimodal surgical perception
tokenized action generation
context-aware robotic surgery
preoperative-intraoperative alignment
Xu Lu
School of Computer Science and Technology, Xidian University
formal methods, AI planning, program verification
Yiling Zhang
Assistant Professor of Industrial and Systems Engineering, University of Minnesota
Stochastic Programming, Integer Programming, Power Systems, Transportation
Wenquan Cheng
School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
Longfei Ma
School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
Fang Chen
School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Hongen Liao
School of Biomedical Engineering, Tsinghua University, Beijing 100084, China; School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China