Sim2Real within 5 Minutes: Efficient Domain Transfer with Stylized Gaussian Splatting for Endoscopic Images

📅 2024-03-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In robot-assisted endoluminal interventions, aligning preoperative CT-reconstructed models with intraoperative endoscopic images is difficult because the two domains differ markedly in texture. This paper proposes an efficient, lightweight domain transfer method that needs only 10 real endoscopic images and completes simulation-to-real style transfer within five minutes. Key contributions include: (1) a stylized Gaussian splatting framework that optimizes only the color appearance parameters while leaving geometry untouched; (2) a structure consistency loss applied to latent features and depth maps to stabilize the transferred images; and (3) a differentiable style transfer mechanism applied to Gaussian point clouds reconstructed from CT. The reported experiments show performance advantages over current state-of-the-art methods in pose estimation and navigation tasks, which points to the method's potential for intraoperative surgical navigation.

📝 Abstract

Robot-assisted endoluminal intervention is an emerging technique for both benign and malignant luminal lesions. Vision-based navigation, combined with pre-operative imaging data as priors, makes it possible to recover the position and pose of the endoscope without additional sensors. In practice, however, aligning the pre-operative and intra-operative domains is complicated by significant texture differences. Although methods such as style transfer can address this issue, they require large datasets from both source and target domains and long training times. This paper proposes an efficient domain transfer method based on stylized Gaussian splatting that requires only a few real images (10 images) and a very short training time. Specifically, the transfer process has two phases. In the first phase, the 3D models reconstructed from CT scans are represented as differential Gaussian point clouds. In the second phase, only the parameters related to color appearance are optimized, so that the style is transferred while the visual content is preserved. A novel structure consistency loss is applied to latent features and depth levels to enhance the stability of the transferred images. Detailed validation was performed to demonstrate the performance advantages of the proposed method over the current state of the art, highlighting its potential for intra-operative surgical navigation.
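The two-phase idea in the abstract can be illustrated with a toy sketch in pure Python. This is not the authors' implementation, and every name in it (`WEIGHTS`, `DEPTHS`, `render_color`, `stylize`, and so on) is hypothetical. It assumes a fixed set of Gaussians whose per-pixel blending weights and depths stand in for the frozen geometry of phase one, and it swaps in a simple channel-mean match as the style loss and a mean-centered color difference as a stand-in for the structure consistency term. Only the per-Gaussian colors are updated, which mirrors the abstract's statement that only color appearance parameters are optimized.

```python
NUM_GAUSSIANS = 4
NUM_PIXELS = 8

# Phase 1 stand-in (assumption): geometry is already fixed. Each pixel blends
# the Gaussians with normalized weights; a cyclic pattern is used for brevity.
BASE = [0.4, 0.3, 0.2, 0.1]
WEIGHTS = [[BASE[(i - p) % NUM_GAUSSIANS] for i in range(NUM_GAUSSIANS)]
           for p in range(NUM_PIXELS)]
DEPTHS = [1.0, 2.0, 3.0, 4.0]  # one depth per Gaussian, never modified


def render_color(colors):
    """Blend per-Gaussian RGB colors into one RGB value per pixel."""
    return [[sum(w[i] * colors[i][c] for i in range(NUM_GAUSSIANS))
             for c in range(3)] for w in WEIGHTS]


def render_depth():
    """Blend per-Gaussian depths; depends on geometry only, not on colors."""
    return [sum(w[i] * DEPTHS[i] for i in range(NUM_GAUSSIANS)) for w in WEIGHTS]


def channel_means(image):
    return [sum(px[c] for px in image) / len(image) for c in range(3)]


def stylize(colors, target_mean, steps=200, lr=0.5, lam=1.0):
    """Phase 2 stand-in: gradient descent on colors only.

    loss = sum_c (mean_c - target_c)^2                      (toy style term)
         + lam * mean_p sum_c (centered - centered_ref)^2   (toy structure term)
    """
    reference = render_color(colors)
    ref_mean = channel_means(reference)
    colors = [list(rgb) for rgb in colors]  # copy; caller's list stays intact
    avg_w = [sum(w[i] for w in WEIGHTS) / NUM_PIXELS for i in range(NUM_GAUSSIANS)]
    for _ in range(steps):
        image = render_color(colors)
        mean = channel_means(image)
        for i in range(NUM_GAUSSIANS):
            for c in range(3):
                grad_style = 2.0 * (mean[c] - target_mean[c]) * avg_w[i]
                grad_struct = 0.0
                for p, w in enumerate(WEIGHTS):
                    diff = (image[p][c] - mean[c]) - (reference[p][c] - ref_mean[c])
                    grad_struct += 2.0 * diff * w[i] / NUM_PIXELS
                colors[i][c] -= lr * (grad_style + lam * grad_struct)
    return colors


# Usage: grey "simulated" colors are pushed toward a reddish target mean that,
# in this sketch, plays the role of statistics from a few real images.
initial_colors = [[0.8] * 3, [0.6] * 3, [0.4] * 3, [0.2] * 3]
target_mean = [0.7, 0.3, 0.25]
depth_before = render_depth()
styled_colors = stylize(initial_colors, target_mean)
styled_means = channel_means(render_color(styled_colors))
```

Because the weights and depths are never touched, the rendered depth is identical before and after stylization, and the structure term keeps the relative contrast between Gaussians while the overall color statistics move toward the target. A real pipeline would replace the toy renderer with a differentiable Gaussian rasterizer and the toy losses with feature-based ones.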

Problem

Research questions and friction points this paper is trying to address.

Aligning pre-operative and intra-operative endoscopic images efficiently

Reducing training time and dataset size for domain transfer

Enhancing intra-operative surgical navigation with fast style transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stylized Gaussian splatting for domain transfer

Few real images required for fast training

Structure consistency loss enhances image stability
