SuFIA-BC: Generating High Quality Demonstration Data for Visuomotor Policy Learning in Surgical Subtasks

📅 2025-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-quality demonstration data for surgical subtasks is scarce due to patient privacy constraints, high acquisition costs, and robot calibration errors. Method: The authors build an enhanced surgical digital twin with photorealistic, anatomically accurate human organ models, integrated into a comprehensive simulator that generates high-quality synthetic demonstrations for fundamental surgical autonomy tasks. On this platform they present SuFIA-BC, visual behavior cloning policies for surgical first interactive autonomy assistants, and systematically compare visual observation spaces ranging from multi-view cameras to 3D representations extracted from a single endoscopic view. Results: Current state-of-the-art behavior cloning techniques struggle on the contact-rich, complex surgical tasks introduced in this work, regardless of their underlying perception or control architectures. These findings highlight the need for perception pipelines, control architectures, and larger-scale synthetic datasets tailored to the specific demands of surgical tasks, and establish a photorealistic benchmark suite for preclinical visuomotor policy training.

📝 Abstract
Behavior cloning facilitates the learning of dexterous manipulation skills, yet the complexity of surgical environments, the difficulty and expense of obtaining patient data, and robot calibration errors present unique challenges for surgical robot learning. We provide an enhanced surgical digital twin with photorealistic human anatomical organs, integrated into a comprehensive simulator designed to generate high-quality synthetic data to solve fundamental tasks in surgical autonomy. We present SuFIA-BC: visual Behavior Cloning policies for Surgical First Interactive Autonomy Assistants. We investigate visual observation spaces including multi-view cameras and 3D visual representations extracted from a single endoscopic camera view. Through systematic evaluation, we find that the diverse set of photorealistic surgical tasks introduced in this work enables a comprehensive evaluation of prospective behavior cloning models for the unique challenges posed by surgical environments. We observe that current state-of-the-art behavior cloning techniques struggle to solve the contact-rich and complex tasks evaluated in this work, regardless of their underlying perception or control architectures. These findings highlight the importance of customizing perception pipelines and control architectures, as well as curating larger-scale synthetic datasets that meet the specific demands of surgical tasks. Project website: https://orbit-surgical.github.io/sufia-bc/
Problem

Research questions and friction points this paper is trying to address.

Generating high-quality synthetic data for surgical robot learning
Addressing challenges in surgical environments for behavior cloning
Evaluating behavior cloning models for complex surgical tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced surgical digital twin with photorealistic organs
Multi-view cameras and 3D visual representations
Customized perception pipelines and control architectures
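At its core, the behavior cloning evaluated in this paper reduces policy learning to supervised regression from visual observations to expert actions recorded in the simulator. A minimal sketch of that idea (not the paper's architecture; the linear policy, synthetic demonstrations, and all variable names here are illustrative stand-ins for the visuomotor networks and digital-twin data):

```python
import numpy as np

# Behavior cloning as supervised regression: fit a policy mapping
# observations (standing in for image or 3D visual features) to
# expert actions. A linear policy replaces the paper's networks.

rng = np.random.default_rng(0)

obs_dim, act_dim, n_demos = 8, 3, 500
W_expert = rng.normal(size=(obs_dim, act_dim))      # unknown expert mapping
observations = rng.normal(size=(n_demos, obs_dim))  # synthetic "demonstrations"
actions = observations @ W_expert + 0.01 * rng.normal(size=(n_demos, act_dim))

# Fit the cloned policy by least squares (the supervised BC objective)
W_policy, *_ = np.linalg.lstsq(observations, actions, rcond=None)

def policy(obs):
    """Cloned policy: predict an action from an observation."""
    return obs @ W_policy

# Evaluate imitation error on held-out observations
test_obs = rng.normal(size=(100, obs_dim))
mse = np.mean((policy(test_obs) - test_obs @ W_expert) ** 2)
print(f"held-out action MSE: {mse:.4f}")
```

The paper's point is that this simple supervised recipe, even with strong perception backbones, degrades on contact-rich surgical tasks, motivating co-designed perception and control rather than off-the-shelf components.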