🤖 AI Summary
Real-time, precise localization and spatiotemporal tracking of bleeding sources during endoscopic submucosal dissection (ESD) remains a critical challenge for surgical safety. Existing AI approaches focus mainly on segmenting bleeding regions and lack robust temporal modeling for dynamic endoscopic scenes, and progress is further hindered by the absence of dedicated benchmark datasets. To address this, we introduce BleedOrigin-Bench, the first comprehensive ESD bleeding-source benchmark dataset, and propose BleedOrigin-Net, a two-stage detection-tracking framework. Stage I detects bleeding onset at the frame level (96.85% accuracy within ±8 frames) and localizes the initial bleeding source (70.24% pixel-level accuracy within 100 px). Stage II integrates point-level tracking with pseudo-label augmentation to achieve continuous spatiotemporal bleeding-source tracking (96.11% pixel-level accuracy within 100 px). In the reported experiments, the framework outperforms widely used object detectors (YOLOv11/v12), multimodal large language models, and point tracking methods. The work targets AI-assisted guidance for endoscopic hemostatic intervention.
📝 Abstract
Intraoperative bleeding during Endoscopic Submucosal Dissection (ESD) poses significant risks, demanding precise, real-time localization and continuous monitoring of the bleeding source for effective hemostatic intervention. In particular, endoscopists have to flush repeatedly to clear blood, which leaves only milliseconds to identify the bleeding source; this inefficient process prolongs operations and elevates patient risk. However, current Artificial Intelligence (AI) methods primarily focus on bleeding region segmentation, overlooking the critical need for accurate bleeding source detection and temporal tracking in the challenging ESD environment, which is marked by frequent visual obstructions and dynamic scene changes. This gap is widened by the lack of specialized datasets, which hinders the development of robust AI-assisted guidance systems. To address these challenges, we introduce BleedOrigin-Bench, the first comprehensive ESD bleeding source dataset, featuring 1,771 expert-annotated bleeding sources across 106,222 frames from 44 procedures, supplemented with 39,755 pseudo-labeled frames. This benchmark covers 8 anatomical sites and 6 challenging clinical scenarios. We also present BleedOrigin-Net, a novel dual-stage detection-tracking framework for bleeding source localization in ESD procedures, addressing the complete workflow from bleeding onset detection to continuous spatial tracking. We compare with widely used object detection models (YOLOv11/v12), multimodal large language models, and point tracking methods. Extensive evaluation demonstrates state-of-the-art performance, achieving 96.85% frame-level accuracy ($\leq \pm 8$ frames) for bleeding onset detection, 70.24% pixel-level accuracy ($\leq 100$ px) for initial source detection, and 96.11% pixel-level accuracy ($\leq 100$ px) for point tracking.
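The tolerance-based metrics quoted above can be illustrated with a minimal Python sketch. It assumes that a prediction counts as correct when it falls within the stated tolerance of the expert annotation (8 frames for onset, 100 px Euclidean distance for the source point). The abstract does not spell out the exact definitions, so the function names `onset_accuracy` and `point_accuracy`, the use of Euclidean distance, and the sample values are all hypothetical.

```python
import math

def onset_accuracy(pred_frames, true_frames, tol=8):
    """Fraction of clips whose predicted onset frame lies within +/- tol frames
    of the annotated onset frame (assumed reading of the frame-level metric)."""
    hits = sum(abs(p - t) <= tol for p, t in zip(pred_frames, true_frames))
    return hits / len(true_frames)

def point_accuracy(pred_points, true_points, max_dist=100.0):
    """Fraction of predicted (x, y) points within max_dist pixels (Euclidean,
    an assumption) of the annotated bleeding source."""
    hits = sum(math.dist(p, t) <= max_dist for p, t in zip(pred_points, true_points))
    return hits / len(true_points)

# Illustrative values only; they are not taken from the paper.
onset = onset_accuracy([10, 50, 100], [12, 60, 100])             # offsets: 2, 10, 0
points = point_accuracy([(0, 0), (300, 400)], [(30, 40), (0, 0)])  # distances: 50, 500
```

The same `point_accuracy` helper would apply to both the initial source detection and the per-frame tracking results if both are scored against annotated points with the same 100 px threshold, as the abstract suggests.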