EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction

2026-03-25
AI Summary
This work addresses geometric discontinuities in soft-tissue 3D reconstruction of surgical scenes. Low texture, specular reflections, and instrument occlusions cause these discontinuities, and existing fixed-topology methods struggle to model them. The authors propose the EndoVGGT framework, whose Deformation-aware Graph Attention (DeGAT) module replaces static neighborhood structures with semantic graphs constructed dynamically in feature space, capturing long-range dependencies among tissue regions. This enables structure propagation under occlusion and recovery of non-rigid deformations. By integrating graph neural network–based depth estimation, dynamic graph construction, and geometric prior learning, the method achieves zero-shot cross-domain generalization. On the SCARED dataset, it improves PSNR by 24.6% and SSIM by 9.1% over the prior state of the art, and it generalizes zero-shot to the unseen SCARED and EndoNeRF domains.
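For readers unfamiliar with the fidelity metric quoted above, the following minimal sketch computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), and a relative gain between two scores. The helper names `psnr` and `relative_gain` are hypothetical, the inputs are flat lists of intensities in [0, 1], and reading the reported 24.6% as a relative gain over a baseline is an assumption; the text does not state how the percentage was computed.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length intensity lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain(baseline, improved):
    """Percentage change of `improved` relative to `baseline` (assumed interpretation)."""
    return 100.0 * (improved - baseline) / baseline
```

The numbers passed to these helpers in any experiment would come from rendered and ground-truth frames; none are taken from the paper here.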

Abstract
Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, low-texture surfaces, specular highlights, and instrument occlusions often fragment geometric continuity, posing a challenge for existing fixed-topology approaches. To address this, we propose EndoVGGT, a geometry-centric framework equipped with a Deformation-aware Graph Attention (DeGAT) module. Rather than using static spatial neighborhoods, DeGAT dynamically constructs feature-space semantic graphs to capture long-range correlations among coherent tissue regions. This enables robust propagation of structural cues across occlusions, enforcing global consistency and improving non-rigid deformation recovery. Extensive experiments on SCARED show that our method significantly improves fidelity, increasing PSNR by 24.6% and SSIM by 9.1% over prior state-of-the-art. Crucially, EndoVGGT exhibits strong zero-shot cross-dataset generalization to the unseen SCARED and EndoNeRF domains, confirming that DeGAT learns domain-agnostic geometric priors. These results highlight the efficacy of dynamic feature-space modeling for consistent surgical 3D reconstruction.
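As a rough illustration of what dynamically constructing a feature-space graph can look like, the sketch below builds a k-nearest-neighbour graph from token features and then aggregates each token's neighbours with softmax attention. It is not the authors' DeGAT implementation: the function names `knn_graph` and `graph_attention`, the Euclidean distance, the scaled dot-product scoring, and the residual update are all assumptions chosen for illustration.

```python
import math


def knn_graph(features, k):
    """Return, for each node, the indices of its k nearest neighbours in feature space."""
    neighbours = []
    for i, fi in enumerate(features):
        # Distances are measured between feature vectors, not pixel positions,
        # so similar-looking regions can connect even when far apart in the image.
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def graph_attention(features, neighbours):
    """Add to each node a softmax-weighted sum of its neighbours' features."""
    dim = len(features[0])
    updated = []
    for i, nbrs in enumerate(neighbours):
        # Scaled dot-product score between node i and each neighbour (assumed scoring).
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        aggregated = [
            sum(w * features[j][d] for w, j in zip(weights, nbrs)) for d in range(dim)
        ]
        # Residual connection keeps the node's own feature in the output.
        updated.append([features[i][d] + aggregated[d] for d in range(dim)])
    return updated


# Toy example: tokens 0 and 2 share one "tissue" feature, tokens 1 and 3 another.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
graph = knn_graph(tokens, k=1)
refined = graph_attention(tokens, graph)
```

Because the graph is rebuilt from the current features rather than fixed by pixel adjacency, tokens 0 and 2 are linked even though they are not adjacent in the list, which is the kind of long-range connection the abstract describes.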
Problem

Research questions and friction points this paper is trying to address.

3D reconstruction
deformable soft tissues
occlusions
low-texture surfaces
surgical robotics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deformation-aware Graph Attention
dynamic feature-space graph
surgical 3D reconstruction
non-rigid deformation recovery
zero-shot generalization
Authors
Falong Fan
University of Arizona, Tucson, AZ, USA
Yi Xie
University of Arizona
Multi-agent System
Arnis Lektauers
University of Arizona, Tucson, AZ, USA
Bo Liu
University of Arizona, AAAI SM, IEEE SM
Reinforcement Learning, Agentic AI, NeuroSymbolic AI, Trustworthy AI
Jerzy Rozenblit
University of Arizona, Tucson, AZ, USA