SuFIA-BC: Generating High Quality Demonstration Data for Visuomotor Policy Learning in Surgical Subtasks

📅 2025-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-quality demonstration data for surgical subtasks is scarce due to patient privacy constraints, high acquisition costs, and robot calibration errors. Method: The authors build an enhanced surgical digital twin with photorealistic, anatomically accurate human organ models, integrated into a comprehensive simulator that generates high-quality synthetic demonstrations for fundamental surgical autonomy tasks. On this platform they present SuFIA-BC, visual behavior cloning policies for surgical first interactive autonomy assistants, and systematically compare visual observation spaces ranging from multi-view cameras to 3D representations extracted from a single endoscopic view. Results: Current state-of-the-art behavior cloning techniques struggle on the contact-rich, complex surgical tasks introduced in this work, regardless of their underlying perception or control architectures. These findings highlight the need for perception pipelines, control architectures, and larger-scale synthetic datasets tailored to the specific demands of surgical tasks, and establish a photorealistic benchmark suite for preclinical visuomotor policy training.

📝 Abstract
Behavior cloning facilitates the learning of dexterous manipulation skills, yet the complexity of surgical environments, the difficulty and expense of obtaining patient data, and robot calibration errors present unique challenges for surgical robot learning. We provide an enhanced surgical digital twin with photorealistic human anatomical organs, integrated into a comprehensive simulator designed to generate high-quality synthetic data to solve fundamental tasks in surgical autonomy. We present SuFIA-BC: visual Behavior Cloning policies for Surgical First Interactive Autonomy Assistants. We investigate visual observation spaces including multi-view cameras and 3D visual representations extracted from a single endoscopic camera view. Through systematic evaluation, we find that the diverse set of photorealistic surgical tasks introduced in this work enables a comprehensive evaluation of prospective behavior cloning models for the unique challenges posed by surgical environments. We observe that current state-of-the-art behavior cloning techniques struggle to solve the contact-rich and complex tasks evaluated in this work, regardless of their underlying perception or control architectures. These findings highlight the importance of customizing perception pipelines and control architectures, as well as curating larger-scale synthetic datasets that meet the specific demands of surgical tasks. Project website: https://orbit-surgical.github.io/sufia-bc/
Problem

Research questions and friction points this paper is trying to address.

Generating high-quality synthetic data for surgical robot learning
Addressing challenges in surgical environments for behavior cloning
Evaluating behavior cloning models for complex surgical tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced surgical digital twin with photorealistic organs
Multi-view cameras and 3D visual representations
Customized perception pipelines and control architectures
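At its core, the behavior cloning evaluated in this paper reduces policy learning to supervised regression from visual observations to expert actions recorded in the simulator. A minimal sketch of that idea (not the paper's architecture; the linear policy, synthetic demonstrations, and all variable names here are illustrative stand-ins for the visuomotor networks and digital-twin data):

```python
import numpy as np

# Behavior cloning as supervised regression: fit a policy mapping
# observations (standing in for image or 3D visual features) to
# expert actions. A linear policy replaces the paper's networks.

rng = np.random.default_rng(0)

obs_dim, act_dim, n_demos = 8, 3, 500
W_expert = rng.normal(size=(obs_dim, act_dim))      # unknown expert mapping
observations = rng.normal(size=(n_demos, obs_dim))  # synthetic "demonstrations"
actions = observations @ W_expert + 0.01 * rng.normal(size=(n_demos, act_dim))

# Fit the cloned policy by least squares (the supervised BC objective)
W_policy, *_ = np.linalg.lstsq(observations, actions, rcond=None)

def policy(obs):
    """Cloned policy: predict an action from an observation."""
    return obs @ W_policy

# Evaluate imitation error on held-out observations
test_obs = rng.normal(size=(100, obs_dim))
mse = np.mean((policy(test_obs) - test_obs @ W_expert) ** 2)
print(f"held-out action MSE: {mse:.4f}")
```

The paper's point is that this simple supervised recipe, even with strong perception backbones, degrades on contact-rich surgical tasks, motivating co-designed perception and control rather than off-the-shelf components.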