🤖 AI Summary
This work addresses a limitation of existing endoscopic scene reconstruction methods: they focus on visual appearance and lack the physical modeling needed for high-fidelity dynamic simulation. The authors propose a 4D Gaussian Splatting (4DGS) framework guided by a multimodal large language model (MLLM). The framework combines the MLLM with a differentiable Material Point Method (MPM) to automatically infer and refine object-level material parameters, enabling physics-aware reconstruction and simulation of deformable tissues and tools. Pre-trained segmentation and depth estimation support the 4DGS representation, while rendered images and optical flow jointly supervise the material refinement. On both open-source and in-house datasets, the framework is reported to achieve higher simulation fidelity and physical accuracy than state-of-the-art methods.
📝 Abstract
In robot-assisted minimally invasive surgery, high-fidelity dynamic endoscopic scene reconstruction and simulation are crucial to enhancing downstream tasks and improving surgical outcomes. However, existing methods primarily focus on visual reconstruction and lack the physics-based scene descriptions required for realistic simulation. We propose a unified framework that achieves physics-aware reconstruction and physical simulation of endoscopic scenes through Gaussian Splatting guided by Multi-modal Large Language Models (MLLMs). Our approach uses 4D Gaussian Splatting (4DGS) integrated with pre-trained segmentation and depth estimation to represent deformable tissues and tools. To infer physical properties automatically, we introduce an object-wise material field that initializes material parameters via an MLLM and refines them through a differentiable Material Point Method (MPM) under joint supervision from rendered images and optical flow. Validated on both open-source and in-house datasets, our framework achieves superior simulation fidelity and physical accuracy compared to state-of-the-art methods, underscoring its potential to advance robot-assisted surgical applications.
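
The "initialize, then refine through a differentiable simulator" idea in the abstract can be illustrated with a deliberately tiny stand-in. The sketch below is not the authors' method. It swaps the MPM solver for a 1D mass-spring system, replaces the image and optical-flow losses with a plain position loss, and hard-codes the initial guess that an MLLM would supply. All names (`simulate`, `refine_stiffness`, `k_prior`) and all numeric values are hypothetical and chosen only for illustration.

```python
# Toy illustration only: a 1D mass-spring stands in for the MPM solver,
# and a position loss stands in for the image / optical-flow supervision.

def simulate(k, steps=200, dt=0.01, x0=1.0, v0=0.0, m=1.0):
    """Symplectic-Euler rollout. Returns positions and their derivatives d(x)/d(k)."""
    x, v = x0, v0
    dx, dv = 0.0, 0.0  # forward sensitivities of x and v with respect to k
    xs, dxs = [], []
    for _ in range(steps):
        # Differentiate the velocity update v <- v - dt * k * x / m with respect to k.
        dv = dv - dt * (x + k * dx) / m
        v = v - dt * k * x / m
        # Differentiate the position update x <- x + dt * v.
        dx = dx + dt * dv
        x = x + dt * v
        xs.append(x)
        dxs.append(dx)
    return xs, dxs


def loss_and_grad(k, observed):
    """Mean squared position error and its gradient with respect to k."""
    xs, dxs = simulate(k, steps=len(observed))
    n = len(observed)
    loss = sum((x - o) ** 2 for x, o in zip(xs, observed)) / n
    grad = sum(2.0 * (x - o) * d for x, o, d in zip(xs, observed, dxs)) / n
    return loss, grad


def refine_stiffness(k_init, observed, lr=3.0, iters=300):
    """Plain gradient descent on the stiffness, starting from a prior guess."""
    k = k_init
    history = []
    for _ in range(iters):
        loss, grad = loss_and_grad(k, observed)
        history.append(loss)
        k -= lr * grad
    return k, history


# Synthetic "observation" generated with a known stiffness (assumed value).
k_true = 4.0
observed, _ = simulate(k_true)

# Hypothetical prior, standing in for the value an MLLM might propose.
k_prior = 2.5

k_fit, loss_history = refine_stiffness(k_prior, observed)
print(f"prior={k_prior}, refined={k_fit:.4f}, target={k_true}")
```

The point of the sketch is the division of labor: a coarse prior places the parameter in a reasonable range, and gradients propagated through the simulation rollout then pull it toward the value that best explains the observed motion. In the framework described above, the same roles are played by the MLLM initialization and the differentiable MPM, with supervision coming from rendered images and optical flow rather than from direct position measurements.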