AI Summary
Existing 3D colonoscopy datasets are scarce, which hinders quantitative evaluation and clinical robustness validation of 3D colon reconstruction algorithms. To address this, we introduce C3VDv2, the second version of a high-definition 3D colonoscopy video dataset with multi-modal ground truth. It comprises 192 videos captured on 60 unique, high-fidelity silicone colon phantom segments, with synchronized ground truth that includes depth maps, surface normals, optical flow, occlusion, camera poses, coverage maps, and 3D models. The dataset explicitly incorporates challenging scenarios such as fecal debris, mucous pools, blood, debris obscuring the colonoscope lens, en-face views, and fast camera motion. We provide full ground truth for 169 videos, alongside 8 screening videos simulated by a gastroenterologist with ground truth poses and 15 videos featuring colon deformations for qualitative assessment. This resource supports the development and quantitative evaluation of 3D colon reconstruction algorithms under more realistic conditions.
Abstract
Computer vision techniques have the potential to improve the diagnostic performance of colonoscopy, but the lack of 3D colonoscopy datasets for training and validation hinders their development. This paper introduces C3VDv2, the second version (v2) of the high-definition Colonoscopy 3D Video Dataset, featuring enhanced realism designed to facilitate the quantitative evaluation of 3D colon reconstruction algorithms. 192 video sequences were captured by imaging 60 unique, high-fidelity silicone colon phantom segments. Ground truth depth, surface normals, optical flow, occlusion, six-degree-of-freedom pose, coverage maps, and 3D models are provided for 169 colonoscopy videos. Eight simulated screening colonoscopy videos acquired by a gastroenterologist are provided with ground truth poses. The dataset includes 15 videos featuring colon deformations for qualitative assessment. C3VDv2 emulates diverse and challenging scenarios for 3D reconstruction algorithms, including fecal debris, mucous pools, blood, debris obscuring the colonoscope lens, en-face views, and fast camera motion. The enhanced realism of C3VDv2 will allow for more robust and representative development and evaluation of 3D reconstruction algorithms.