CAT-SG: A Large Dynamic Scene Graph Dataset for Fine-Grained Understanding of Cataract Surgery

📅 2025-06-26
🤖 AI Summary
Existing cataract surgery datasets primarily target isolated tasks, such as surgical tool detection or phase segmentation. They do not explicitly model the semantic relationships among tools, anatomical structures, and procedural techniques, nor the temporal dependencies between them. To address this, we introduce CAT-SG, the first dynamic scene graph dataset for cataract surgery, featuring structured tool-tissue interaction annotations and explicit modeling of surgical workflow variability. We further propose CatSGG, a dedicated scene graph generation model for surgical video understanding, which integrates multimodal video parsing, temporal relation modeling, and graph neural networks to produce fine-grained, temporally aware scene graphs. Experiments demonstrate that CatSGG significantly improves surgical phase recognition accuracy. Together, CAT-SG and CatSGG establish an interpretable, context-aware foundation for AI-assisted surgical training, real-time decision support, and surgical workflow analytics.

📝 Abstract
Understanding the intricate workflows of cataract surgery requires modeling complex interactions between surgical tools, anatomical structures, and procedural techniques. Existing datasets primarily address isolated aspects of surgical analysis, such as tool detection or phase segmentation, but lack comprehensive representations that capture the semantic relationships between entities over time. This paper introduces the Cataract Surgery Scene Graph (CAT-SG) dataset, the first to provide structured annotations of tool-tissue interactions, procedural variations, and temporal dependencies. By incorporating detailed semantic relations, CAT-SG offers a holistic view of surgical workflows, enabling more accurate recognition of surgical phases and techniques. Additionally, we present a novel scene graph generation model, CatSGG, which outperforms current methods in generating structured surgical representations. The CAT-SG dataset is designed to enhance AI-driven surgical training, real-time decision support, and workflow analysis, paving the way for more intelligent, context-aware systems in clinical practice.
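
To make the idea of a dynamic scene graph concrete, the following is a minimal sketch of one possible representation: each video frame carries a set of (subject, predicate, object) triplets, and consecutive frames with the same triplet are merged into time intervals. The class names, the `relation_intervals` helper, and the labels (`knife`, `incises`, `cornea`, `forceps`, `holds`) are illustrative assumptions, not the actual CAT-SG schema or label set.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triplet:
    """One relation edge: subject --predicate--> object (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class FrameGraph:
    """Scene graph for a single frame, stored as a set of triplets."""
    frame: int
    triplets: set[Triplet] = field(default_factory=set)


def relation_intervals(graphs: list[FrameGraph]) -> dict[Triplet, list[tuple[int, int]]]:
    """Merge per-frame triplets into inclusive (start, end) frame intervals.

    A triplet present in consecutive frames extends its current interval;
    a gap of one or more frames starts a new interval.
    """
    intervals: dict[Triplet, list[tuple[int, int]]] = {}
    for graph in sorted(graphs, key=lambda g: g.frame):
        for triplet in graph.triplets:
            spans = intervals.setdefault(triplet, [])
            if spans and spans[-1][1] == graph.frame - 1:
                spans[-1] = (spans[-1][0], graph.frame)
            else:
                spans.append((graph.frame, graph.frame))
    return intervals


# Illustrative labels only; these are assumptions, not taken from the dataset.
cut = Triplet("knife", "incises", "cornea")
hold = Triplet("forceps", "holds", "cornea")

video = [
    FrameGraph(0, {cut}),
    FrameGraph(1, {cut, hold}),
    FrameGraph(2, {hold}),
    FrameGraph(4, {cut}),
]

intervals = relation_intervals(video)
print(intervals[cut])
print(intervals[hold])
```

Grouping relations into intervals is one simple way to expose the temporal dependencies the abstract refers to. A phase recognizer could, for example, consume these intervals as features, though how CatSGG actually uses the graphs is not specified here.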
Problem

Research questions and friction points this paper is trying to address.

Modeling complex interactions in cataract surgery workflows
Lack of comprehensive datasets for surgical semantic relationships
Enhancing AI-driven surgical training and decision support
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces CAT-SG dataset for cataract surgery analysis
Develops CatSGG model for structured scene graphs
Enables AI-driven surgical training and decision support