4D Monocular Surgical Reconstruction under Arbitrary Camera Motions

📅 2026-02-19
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses high-fidelity 4D reconstruction of deformable surgical scenes from monocular endoscopic videos under arbitrary camera motion. Existing methods rely on fixed viewpoints, stereo depth, or precise motion initialization, which makes them unsuitable for real clinical settings. To overcome these constraints, the authors propose Local-EndoGS, a framework that constructs local deformable 3D Gaussian Splatting models within a sliding window. It combines coarse-to-fine robust initialization, multi-view geometric constraints, monocular depth priors, and cross-window information fusion, and adds long-range pixel trajectories and physical motion priors to keep deformations plausible. The method enables scalable 4D reconstruction under arbitrary camera trajectories without requiring stereo inputs or accurate structure-from-motion (SfM). Experiments on three public deformable endoscopic datasets show that it consistently outperforms state-of-the-art approaches, and ablation studies support the contribution of each component.

📝 Abstract
Reconstructing deformable surgical scenes from endoscopic videos is challenging and clinically important. Recent state-of-the-art methods based on implicit neural representations or 3D Gaussian splatting have made notable progress. However, most are designed for deformable scenes with fixed endoscope viewpoints and rely on stereo depth priors or accurate structure-from-motion for initialization and optimization, limiting their ability to handle monocular sequences with large camera motion in real clinical settings. To address this, we propose Local-EndoGS, a high-quality 4D reconstruction framework for monocular endoscopic sequences with arbitrary camera motion. Local-EndoGS introduces a progressive, window-based global representation that allocates local deformable scene models to each observed window, enabling scalability to long sequences with substantial motion. To overcome unreliable initialization without stereo depth or accurate structure-from-motion, we design a coarse-to-fine strategy integrating multi-view geometry, cross-window information, and monocular depth priors, providing a robust foundation for optimization. We further incorporate long-range 2D pixel trajectory constraints and physical motion priors to improve deformation plausibility. Experiments on three public endoscopic datasets with deformable scenes and varying camera motions show that Local-EndoGS consistently outperforms state-of-the-art methods in appearance quality and geometry. Ablation studies validate the effectiveness of our key designs. Code will be released upon acceptance at: https://github.com/IRMVLab/Local-EndoGS.
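The window-based representation described in the abstract can be illustrated with a minimal sketch. It only shows how a long frame sequence might be split into overlapping windows, each of which would own one local deformable scene model. The fixed window size, the fixed overlap, and the names `Window`, `make_windows`, and `windows_for_frame` are assumptions made for illustration; the abstract does not say how Local-EndoGS chooses its windows, and this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Window:
    """A contiguous span of frames served by one local scene model (hypothetical)."""
    index: int
    start: int  # first frame index, inclusive
    end: int    # last frame index, exclusive


def make_windows(num_frames: int, size: int, overlap: int) -> List[Window]:
    """Split frames [0, num_frames) into overlapping fixed-size windows.

    Assumption: fixed size and overlap. A real system could instead open a
    new window based on camera motion or scene visibility.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap
    windows: List[Window] = []
    start = 0
    while start < num_frames:
        end = min(start + size, num_frames)
        windows.append(Window(len(windows), start, end))
        if end == num_frames:
            break  # the last window already reaches the final frame
        start += stride
    return windows


def windows_for_frame(windows: List[Window], frame: int) -> List[int]:
    """Return the indices of all windows whose span contains the frame."""
    return [w.index for w in windows if w.start <= frame < w.end]


if __name__ == "__main__":
    ws = make_windows(num_frames=10, size=4, overlap=1)
    print([(w.start, w.end) for w in ws])  # [(0, 4), (3, 7), (6, 10)]
    print(windows_for_frame(ws, 3))        # [0, 1]
```

Frames that fall in the overlap belong to two windows, which is one simple place where cross-window information could be shared between neighbouring local models.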
Problem

Research questions and friction points this paper is trying to address.

4D reconstruction
monocular endoscopy
deformable scenes
arbitrary camera motion
surgical scene
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D surgical reconstruction
monocular endoscopy
arbitrary camera motion
deformable scene modeling
Gaussian splatting
Jiwei Shan
Department of Mechanical and Automation Engineering and T Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong.
Zeyu Cai
Institute of Heavy Ion Physics, Peking University
AI for Science, Plasma Physics, AI Agents, Number Theory
Cheng-Tai Hsieh
School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai, China
Yirui Li
Department of Mechanical and Automation Engineering and T Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong.
Hao Liu
CASIA
Face Recognition
Lijun Han
Shanghai Jiao Tong University
Hesheng Wang
School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
Shing Shin Cheng
Associate Professor, The Chinese University of Hong Kong
Medical Robotics, Continuum Robots, Image-guided Surgery, Modeling and Control