EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction

📅 2026-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses geometric discontinuities in soft-tissue 3D reconstruction of surgical scenes. Low texture, specular reflections, and instrument occlusions fragment the geometry, and existing fixed-topology methods struggle to model the result. The authors propose EndoVGGT, a framework built around a Deformation-aware Graph Attention (DeGAT) module that replaces static spatial neighborhoods with semantic graphs constructed dynamically in feature space, capturing long-range dependencies among coherent tissue regions. This lets structural cues propagate across occlusions and improves recovery of non-rigid deformations. On the SCARED dataset, the method improves PSNR by 24.6% and SSIM by 9.1% over the prior state of the art. It is also reported to generalize zero-shot to the unseen SCARED and EndoNeRF domains, which the authors attribute to DeGAT learning domain-agnostic geometric priors.
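For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard PSNR definition, plus a helper for a relative percentage gain. It assumes images scaled to [0, 1] and assumes the reported percentages are relative gains over a baseline; the summary does not state either, and the function names `psnr` and `relative_gain` are illustrative, not taken from the paper.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB (standard definition).

    data_range is the maximum possible pixel value (assumed 1.0 here).
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline
```

Under this reading, a 24.6% PSNR improvement would mean `relative_gain(baseline_psnr, new_psnr)` returns 24.6 for the baseline and new scores, whose absolute values are not given in this summary.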

📝 Abstract

Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, low-texture surfaces, specular highlights, and instrument occlusions often fragment geometric continuity, posing a challenge for existing fixed-topology approaches. To address this, we propose EndoVGGT, a geometry-centric framework equipped with a Deformation-aware Graph Attention (DeGAT) module. Rather than using static spatial neighborhoods, DeGAT dynamically constructs feature-space semantic graphs to capture long-range correlations among coherent tissue regions. This enables robust propagation of structural cues across occlusions, enforcing global consistency and improving non-rigid deformation recovery. Extensive experiments on SCARED show that our method significantly improves fidelity, increasing PSNR by 24.6% and SSIM by 9.1% over prior state-of-the-art. Crucially, EndoVGGT exhibits strong zero-shot cross-dataset generalization to the unseen SCARED and EndoNeRF domains, confirming that DeGAT learns domain-agnostic geometric priors. These results highlight the efficacy of dynamic feature-space modeling for consistent surgical 3D reconstruction.
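The idea of attending over a dynamically built feature-space graph, rather than a fixed spatial neighborhood, can be illustrated with a minimal NumPy sketch. This is not the authors' DeGAT implementation: the cosine-similarity k-nearest-neighbor graph, the scaled dot-product attention, the residual update, and the names `knn_feature_graph` and `graph_attention` are all assumptions chosen for illustration.

```python
import numpy as np


def knn_feature_graph(feats, k):
    """Indices (N, k) of each token's k nearest neighbors in feature space.

    Neighbors are chosen by cosine similarity of features, not by pixel
    position, so two distant patches of similar tissue can be linked.
    """
    feats = np.asarray(feats, dtype=float)
    normed = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a token is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def graph_attention(feats, idx):
    """Aggregate neighbor features with softmax attention (assumed form)."""
    feats = np.asarray(feats, dtype=float)
    d = feats.shape[1]
    neigh = feats[idx]  # (N, k, D) gathered neighbor features
    scores = np.einsum("nd,nkd->nk", feats, neigh) / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    aggregated = np.einsum("nk,nkd->nd", weights, neigh)
    return feats + aggregated, weights  # residual update, attention weights


# Hypothetical usage: 64 tokens with 16-dimensional features, 8 neighbors each.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(64, 16))
neighbors = knn_feature_graph(tokens, k=8)
updated, attn = graph_attention(tokens, neighbors)
```

Because the graph is rebuilt from the current features on every call, the neighborhood of a token changes as the tissue deforms or becomes occluded, which is the property the abstract contrasts with static spatial neighborhoods.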

Problem

Research questions and friction points this paper is trying to address.

3D reconstruction

deformable soft tissues

occlusions

low-texture surfaces

surgical robotics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deformation-aware Graph Attention

dynamic feature-space graph

surgical 3D reconstruction