🤖 AI Summary
Existing gynecological laparoscopic surgery datasets suffer from limited scale, task singularity, and coarse-grained annotations, hindering end-to-end surgical workflow modeling and interpretable analysis. To address these limitations, we introduce Gyn-Lap—the first large-scale, multi-task, fine-grained multimodal dataset for gynecological laparoscopy—supporting four core tasks: surgical action recognition, pixel-level semantic segmentation, structured intraoperative report generation, and novel procedure discovery. Our methodology features holistic, multi-center, hierarchical joint annotation (frame-level actions + pixel-level segmentation + structured textual reports) and establishes standardized acquisition and inter-annotator consistency verification protocols. The publicly released dataset comprises thousands of high-quality surgical videos with comprehensive annotations. Benchmark experiments demonstrate substantial improvements in state-of-the-art model generalization: +5.2% accuracy in action recognition and +4.8% mIoU in semantic segmentation.
📝 Abstract
Recent advances in deep learning have transformed computer-assisted intervention and surgical video analysis, driving improvements not only in surgical training, intraoperative decision support, and patient outcomes, but also in postoperative documentation and surgical discovery. Central to these developments is the availability of large, high-quality annotated datasets. In gynecologic laparoscopy, surgical scene understanding and action recognition are fundamental for building intelligent systems that assist surgeons during operations and provide deeper analysis after surgery. However, existing datasets are often limited by small scale, narrow task focus, or insufficiently detailed annotations, limiting their utility for comprehensive, end-to-end workflow analysis. To address these limitations, we introduce GynSurg, the largest and most diverse multi-task dataset for gynecologic laparoscopic surgery to date. GynSurg provides rich annotations across multiple tasks, supporting applications in action recognition, semantic segmentation, surgical documentation, and discovery of novel procedural insights. We demonstrate the dataset quality and versatility by benchmarking state-of-the-art models under a standardized training protocol. To accelerate progress in the field, we publicly release the GynSurg dataset and its annotations