🤖 AI Summary
Publicly available video datasets of robotic-assisted minimally invasive surgery (MIS) at native 4K resolution are lacking, which hinders research in high-fidelity visual analysis and precise intraoperative guidance. To address this, we introduce the first open-source, native 4K robotic surgery video dataset, captured across multiple clinical centers and diverse surgical procedures under real-world, long-duration operating conditions. The dataset captures challenging intraoperative phenomena, including specular reflections, instrument occlusions, blood pooling, and smoke, using clinical-grade 4K endoscopic systems. All videos undergo rigorous annotation and quality control to support benchmarking of advanced vision tasks such as video super-resolution, smoke removal, monocular depth estimation, and instance segmentation. This resource establishes a high-quality, standardized benchmark for surgical image enhancement, 3D reconstruction, vision-language modeling, and intelligent navigation systems, filling a longstanding gap in high-resolution surgical vision research.
📝 Abstract
High-resolution imaging is crucial for enhancing visual clarity and enabling precise computer-assisted guidance in minimally invasive surgery (MIS). Despite the increasing adoption of 4K endoscopic systems, there remains a significant gap in publicly available native 4K datasets tailored to robotic-assisted MIS. We introduce SurgiSR4K, the first publicly accessible surgical imaging and video dataset captured at native 4K resolution, representing realistic conditions of robotic-assisted procedures. SurgiSR4K comprises diverse visual scenarios, including specular reflections, tool occlusions, bleeding, and soft tissue deformations, meticulously designed to reflect common challenges faced during laparoscopic and robotic surgeries. The dataset opens up possibilities for a broad range of computer vision tasks that might benefit from high-resolution data, such as super-resolution (SR), smoke removal, surgical instrument detection, 3D tissue reconstruction, monocular depth estimation, instance segmentation, novel view synthesis, and vision-language model (VLM) development. SurgiSR4K provides a robust foundation for advancing research in high-resolution surgical imaging and fosters the development of intelligent imaging technologies aimed at enhancing performance, safety, and usability in image-guided robotic surgeries.