SWAG: Long-term Surgical Workflow Prediction with Generative-based Anticipation

📅 2024-12-25
🤖 AI Summary
Existing surgical video analysis methods recognize only the current procedural step and offer no long-term, multi-step prediction, which limits real-time intraoperative decision support. This work introduces SWAG, a unified framework that jointly models phase recognition and anticipation, predicting surgical phase sequences up to 15 minutes ahead and regressing remaining time. It combines two generative decoding schemes, single-pass (SP) and auto-regressive (AR), incorporates prior knowledge through an embedding mechanism, and introduces a regression-to-classification (R2C) method that maps continuous predictions to discrete temporal segments. On AutoLaparo21, the single-pass classification model with prior knowledge embeddings (SWAG-SP*) achieves 53.5% accuracy for 15-minute anticipation; on Cholec80, the R2C model reaches 60.8% accuracy. For remaining time prediction, the single-pass regression approach attains weighted mean absolute errors of 0.32 and 0.48 minutes at 2- and 3-minute horizons, outperforming existing methods.

📝 Abstract
While existing recognition approaches excel at identifying current surgical phases, they provide limited foresight into future procedural steps, restricting their intraoperative utility. Similarly, current anticipation methods are constrained to predicting short-term events or singular future occurrences, neglecting the dynamic and sequential nature of surgical workflows. To address these limitations, we propose SWAG (Surgical Workflow Anticipative Generation), a unified framework for phase recognition and long-term anticipation of surgical workflows. SWAG employs two generative decoding methods -- single-pass (SP) and auto-regressive (AR) -- to predict sequences of future surgical phases. A novel prior knowledge embedding mechanism enhances the accuracy of anticipatory predictions. The framework addresses future phase classification and remaining time regression tasks. Additionally, a regression-to-classification (R2C) method is introduced to map continuous predictions to discrete temporal segments. SWAG's performance was evaluated on the Cholec80 and AutoLaparo21 datasets. The single-pass classification model with prior knowledge embeddings (SWAG-SP*) achieved 53.5% accuracy in 15-minute anticipation on AutoLaparo21, while the R2C model reached 60.8% accuracy on Cholec80. SWAG's single-pass regression approach outperformed existing methods for remaining time prediction, achieving weighted mean absolute errors of 0.32 and 0.48 minutes for 2- and 3-minute horizons, respectively. SWAG demonstrates versatility across classification and regression tasks, offering robust tools for real-time surgical workflow anticipation. By unifying recognition and anticipatory capabilities, SWAG provides actionable predictions to enhance intraoperative decision-making.
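
The abstract describes R2C only as mapping continuous predictions to discrete temporal segments. The sketch below shows one plausible reading of that idea, not the authors' implementation: it assumes the regressor outputs, for each phase, the predicted minutes until that phase starts, and it labels each future time bin with the phase that has most recently started. The function name, its parameters, and the one-minute bin size are all hypothetical.

```python
def regression_to_classification(current_phase, minutes_until_start,
                                 horizon_min=15.0, step_min=1.0):
    """Turn per-phase 'minutes until start' predictions into one label per bin.

    Assumption (not from the paper): `minutes_until_start` maps a phase id to
    the predicted time, in minutes, before that phase begins. Phases predicted
    to start beyond the horizon are simply never selected.
    """
    n_bins = int(round(horizon_min / step_min))
    labels = []
    for i in range(1, n_bins + 1):
        t = i * step_min  # end of the i-th future bin, in minutes
        phase, latest_start = current_phase, 0.0
        # Pick the phase whose predicted start is the latest one not after t.
        for candidate, start in minutes_until_start.items():
            if latest_start < start <= t:
                phase, latest_start = candidate, start
        labels.append(phase)
    return labels


# Hypothetical example: currently in phase 0, phase 1 predicted to start in
# 2.5 minutes and phase 2 in 4 minutes, with a 5-minute horizon.
future = regression_to_classification(0, {1: 2.5, 2: 4.0}, horizon_min=5.0)
print(future)
```

With this reading, the discrete sequence can be scored with the same accuracy metric as a directly classified sequence, which is what makes the regression and classification variants comparable.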
Problem

Research questions and friction points this paper is trying to address.

Surgical Prediction
Long-term Forecasting
Multi-step Prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

SWAG
Predictive Technology
Surgical Decision Support
Maxence Boels
Surgical and Interventional Engineering, King’s College London
Yang Liu
Surgical and Interventional Engineering, King’s College London
Prokar Dasgupta
King's Health Partners Professor of Surgery
robotic surgery, simulation, immunology, prostate cancer, overactive bladder
Alejandro Granados
KCL
Surgical Data Science, Generative Models, Causal AI
Sébastien Ourselin
Surgical and Interventional Engineering, King’s College London