AI Summary
Manual annotation of medical volumetric images (e.g., MRI/CT) is time-consuming and error-prone. Existing video segmentation foundation models, such as SAM 2, exhibit severe boundary mispropagation across slices, limiting their clinical utility in 3D medical image segmentation. To address this, we propose a novel 3D medical image segmentation framework built upon SAM 2, featuring a dual short-term/long-term memory bank coupled with independent attention modules to explicitly model local slice-wise continuity and global anatomical consistency. Our architecture integrates dual-path memory storage, hierarchical spatiotemporal attention, and modality-adaptive adapters, enabling robust multi-organ 3D segmentation (e.g., organs, bones, muscles). Evaluated on three public benchmarks, our method achieves mean Dice improvements of 0.11 to 0.14 using only 1 to 5 annotated samples per organ for fine-tuning. It significantly suppresses over-propagation and enhances boundary robustness, advancing the deployment of high-accuracy automated medical annotation in clinical practice.
Abstract
Manual annotation of volumetric medical images, such as magnetic resonance imaging (MRI) and computed tomography (CT), is a labor-intensive and time-consuming process. Recent advancements in foundation models for video object segmentation, such as the Segment Anything Model 2 (SAM 2), offer a potential opportunity to significantly speed up the annotation process by manually annotating one or a few slices and then propagating target masks across the entire volume. However, the performance of SAM 2 in this context varies. Our experiments show that relying on a single memory bank and attention module is prone to error propagation, particularly at boundary regions where the target is present in the previous slice but absent in the current one. To address this problem, we propose Short-Long Memory SAM 2 (SLM-SAM 2), a novel architecture that integrates distinct short-term and long-term memory banks with separate attention modules to improve segmentation accuracy. We evaluate SLM-SAM 2 on three public datasets covering organs, bones, and muscles across MRI and CT modalities. We show that the proposed method markedly outperforms the default SAM 2, achieving average Dice Similarity Coefficient improvements of 0.14 and 0.11 in the scenarios when 5 volumes and 1 volume are available for the initial adaptation, respectively. SLM-SAM 2 also exhibits stronger resistance to over-propagation, making a notable step toward more accurate automated annotation of medical images for segmentation model development.
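To make the short/long-term memory idea concrete, the sketch below shows one way two memory banks with separate attention reads could be organized during slice-by-slice propagation: a small rolling buffer of recent slice features (local continuity) and a sparsely sampled long-term bank (global anatomical context). All names, capacities, the sampling stride, and the averaging fusion are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a dual short/long-term memory read, NOT SLM-SAM 2's
# real code. Features are plain lists of floats for self-containment.
from collections import deque
import math

def attention(query, memories):
    """Scaled dot-product attention of one query vector over memory vectors."""
    if not memories:
        return list(query)  # empty bank: pass the query through unchanged
    d = len(query)
    scores = [sum(q * m for q, m in zip(query, mem)) / math.sqrt(d)
              for mem in memories]
    mx = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]    # softmax over memory entries
    return [sum(w * mem[i] for w, mem in zip(weights, memories))
            for i in range(d)]

class ShortLongMemory:
    """Two banks read by separate attention calls, then fused (assumed: mean)."""
    def __init__(self, short_capacity=3, long_stride=5):
        self.short = deque(maxlen=short_capacity)  # most recent slices
        self.long = []                             # sparse, persistent slices
        self.long_stride = long_stride
        self.step = 0

    def read(self, query):
        s = attention(query, list(self.short))  # local slice-wise continuity
        l = attention(query, self.long)         # global anatomical consistency
        return [(a + b) / 2 for a, b in zip(s, l)]

    def write(self, feature):
        self.short.append(feature)
        if self.step % self.long_stride == 0:   # keep every Nth slice long-term
            self.long.append(feature)
        self.step += 1

mem = ShortLongMemory()
for i in range(10):                      # propagate through 10 slices
    feat = [float(i), 1.0, 0.0, float(i % 2)]
    fused = mem.read(feat)               # read both banks before writing
    mem.write(feat)
print(len(mem.short), len(mem.long))     # prints: 3 2
```

Keeping the two reads separate, rather than pooling both banks into one attention call, is what lets the model weight recent-slice evidence and distant anatomical context independently, which is the failure mode the abstract attributes to SAM 2's single memory bank.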