🤖 AI Summary
In robotic-assisted microsurgery, fine instruments such as needle drivers and forceps are difficult to segment accurately because of low contrast, resolution loss, and severe class imbalance, which lead to poor segmentation precision and structural discontinuities. To address these issues, this paper proposes MISRA, a segmentation framework that (i) augments RGB inputs with luminance channels for improved robustness; (ii) incorporates a skip-attention mechanism to preserve slender structural features; and (iii) introduces an iterative feedback module that restores geometric continuity over multiple segmentation passes. The paper also presents MicroInstruments, described as the first high-fidelity, expert-annotated dataset designed specifically for microsurgical instrument segmentation. Evaluated on MicroInstruments, MISRA improves mean class IoU by 5.37% over competing methods and yields more stable, structurally intact predictions, particularly where instruments touch or occlude one another, providing a more reliable visual parsing foundation for microsurgical scene understanding.
📝 Abstract
Accurate segmentation of thin structures is critical for microsurgical scene understanding but remains challenging due to resolution loss, low contrast, and class imbalance. We propose Microsurgery Instrument Segmentation for Robotic Assistance (MISRA), a segmentation framework that augments RGB input with luminance channels, integrates skip attention to preserve elongated features, and employs an Iterative Feedback Module (IFM) for continuity restoration across multiple passes. In addition, we introduce a dedicated microsurgical dataset with fine-grained annotations of surgical instruments, including thin objects, providing a benchmark for robust evaluation. The dataset is available at https://huggingface.co/datasets/KIST-HARILAB/MISAW-Seg. Experiments demonstrate that MISRA achieves competitive performance, improving the mean class IoU by 5.37% over competing methods, while delivering more stable predictions at instrument contacts and overlaps. These results position MISRA as a promising step toward reliable scene parsing for computer-assisted and robotic microsurgery.
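Two of the ingredients named in the abstract, luminance augmentation of the RGB input and the mean class IoU metric, are concrete enough to sketch. The following is a minimal illustration, not the authors' implementation. It assumes a single luminance channel computed with ITU-R BT.601 luma weights (the abstract does not say which luminance definition, or how many channels, MISRA uses), and it assumes that classes absent from both prediction and ground truth are skipped when averaging. The function names `add_luminance_channel` and `mean_class_iou` are hypothetical.

```python
from typing import List, Sequence, Tuple

RGBPixel = Tuple[float, float, float]
RGBYPixel = Tuple[float, float, float, float]

# Assumption: ITU-R BT.601 luma weights. The paper may define luminance differently.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def add_luminance_channel(image: List[List[RGBPixel]]) -> List[List[RGBYPixel]]:
    """Append a luminance value to every RGB pixel, giving a 4-channel image."""
    wr, wg, wb = LUMA_WEIGHTS
    return [
        [(r, g, b, wr * r + wg * g + wb * b) for (r, g, b) in row]
        for row in image
    ]


def mean_class_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU over classes present in pred or target.

    `pred` and `target` are flattened label maps of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    rgby = add_luminance_channel([[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]])
    print(rgby[0][0], rgby[0][1])

    # Flattened 2x2 label maps: class 0 = background, class 1 = instrument.
    print(mean_class_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Because mean class IoU weights every class equally regardless of pixel count, a thin instrument that covers only a few pixels contributes as much to the score as the background, which is why the metric suits the class imbalance the abstract describes.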