🤖 AI Summary
This work addresses two challenges of monocular geometric estimation in colonoscopy: ground-truth geometry is difficult to obtain because of the narrow, enclosed space of the colon, and artifacts and illumination differences create a large feature gap between simulated and real data. To overcome these issues, the authors propose CoGE, a framework for online monocular geometric estimation that is trained only on simulated data. CoGE uses an illumination-aware supervision module based on Retinex theory to handle the diverse illumination of colonoscopy scenes, and a structure-aware perception module based on wavelet decomposition to extract common structural and local features of the colon. Quantitative and qualitative results show that CoGE, trained solely on simulated data, achieves state-of-the-art geometric estimation on both simulated and real colonoscopy scenes.
📝 Abstract
Geometric estimation, including depth estimation and scene reconstruction, is a crucial technique for colonoscopy that can provide surgeons with 3D spatial perception and navigation. However, geometric ground truth in colonoscopy is difficult to obtain due to the narrow and enclosed space of the colon, and there is a large feature gap between simulated and realistic data caused by artifacts and illumination. In this paper, we present CoGE, a novel framework for online monocular geometric estimation during colonoscopy. First, we propose an illumination-aware supervision module based on Retinex theory to address the illumination diversity across colonoscopy scenes. Second, we propose a structure-aware perception module based on wavelet decomposition to extract common structural and local features of the colon. Both quantitative and qualitative results demonstrate that the proposed model, trained solely on simulated data, achieves state-of-the-art performance in geometric estimation for both simulated and realistic scenes.
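
The abstract names Retinex theory and wavelet decomposition but does not describe how CoGE uses them internally. As a rough illustration of the two classical ideas only, and not the authors' modules, the sketch below shows a single-scale Retinex split of an image into reflectance and illumination (assuming the illumination can be approximated by a Gaussian-blurred copy of the image) and a single-level 2D Haar wavelet transform that separates a coarse approximation band from three detail bands. All function names, the `sigma` and `eps` parameters, and the choice of the Haar wavelet are assumptions made for this example.

```python
import numpy as np


def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian blur with reflect padding (NumPy only)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def retinex_decompose(img, sigma=2.0, eps=1e-6):
    """Single-scale Retinex: img ~= reflectance * illumination.

    Assumption for this sketch: illumination is the smooth, low-frequency
    part of the image, estimated here with a Gaussian blur.
    """
    illumination = gaussian_blur(img, sigma)
    log_reflectance = np.log(img + eps) - np.log(illumination + eps)
    return np.exp(log_reflectance), illumination


def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (even height and width)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # coarse approximation
    lh = (a + b - c - d) / 2.0  # differences between adjacent rows
    hl = (a - b + c - d) / 2.0  # differences between adjacent columns
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstructs the original image."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out
```

In this toy setting the reflectance map changes less than the raw image when the lighting changes smoothly, and the detail bands respond to edges and folds rather than to overall brightness. That is one plausible reading of why these two tools suit illumination and structure cues, but how CoGE actually builds its supervision and perception modules on them is not stated in the abstract.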