AI Summary
In medical image registration, anatomical structure-aware feature extraction is critical for accurate deformation modeling; however, existing weakly supervised methods rely on scarce ground-truth segmentations or landmarks. This paper proposes SAMIR, the first framework to integrate the Segment Anything Model (SAM), a vision foundation model, into unsupervised medical image registration. SAMIR employs task-adaptive anatomical feature extraction, a lightweight 3D decoder, and a hierarchical feature consistency loss to achieve high-precision anatomical alignment without any additional annotations. Its core innovation lies in leveraging SAM's pretrained encoder to encode generic anatomical priors, enhanced via embedding-space optimization and multi-scale feature constraints to improve the anatomical plausibility of deformation fields. Evaluated on the ACDC and abdominal CT datasets, SAMIR outperforms state-of-the-art methods by 2.68% and 6.44%, respectively, significantly improving registration robustness and clinical interpretability.
Abstract
Image registration is a fundamental task in medical image analysis. Deformations are often closely related to the morphological characteristics of tissues, making accurate feature extraction crucial. Recent weakly supervised methods improve registration by incorporating anatomical priors such as segmentation masks or landmarks, either as inputs or in the loss function. However, such weak labels are often not readily available, limiting their practical use. Motivated by the strong representation learning ability of visual foundation models, this paper introduces SAMIR, an efficient medical image registration framework that utilizes the Segment Anything Model (SAM) to enhance feature extraction. SAM is pretrained on large-scale natural image datasets and learns robust, general-purpose visual representations. Rather than operating on raw input images, we design a task-specific adaptation pipeline that uses SAM's image encoder to extract structure-aware feature embeddings, enabling more accurate modeling of anatomical consistency and deformation patterns. We further design a lightweight 3D head to refine features within the embedding space, adapting to local deformations in medical images. Additionally, we introduce a Hierarchical Feature Consistency Loss to guide coarse-to-fine feature matching and improve anatomical alignment. Extensive experiments demonstrate that SAMIR significantly outperforms state-of-the-art methods on benchmark datasets for both intra-subject cardiac image registration and inter-subject abdominal CT image registration, achieving performance improvements of 2.68% on ACDC and 6.44% on the abdomen dataset. The source code will be made publicly available on GitHub upon acceptance of this paper.
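To make the coarse-to-fine idea behind the Hierarchical Feature Consistency Loss concrete, here is a minimal NumPy sketch. It is an illustration, not the paper's implementation: the function names (`avg_pool2d`, `hierarchical_feature_consistency`), the choice of cosine dissimilarity, the uniform level weights, and the 2D feature maps (the paper works with 3D volumes) are all assumptions for readability. The loss compares the fixed image's feature map against the warped moving image's feature map at several pooled scales, so both coarse structure and fine detail must agree.

```python
import numpy as np

def avg_pool2d(x, k=2):
    # x: (C, H, W) feature map; average-pool spatially by factor k
    # (assumes H and W are divisible by k)
    C, H, W = x.shape
    return x.reshape(C, H // k, k, W // k, k).mean(axis=(2, 4))

def hierarchical_feature_consistency(f_fixed, f_warped, levels=3, weights=None):
    """Illustrative multi-scale consistency loss: mean (1 - cosine
    similarity) between fixed and warped feature maps, accumulated
    over progressively coarser (average-pooled) scales."""
    if weights is None:
        weights = [1.0 / levels] * levels  # assumed uniform weighting
    loss = 0.0
    for lvl in range(levels):
        # cosine similarity per spatial location over the channel axis
        num = (f_fixed * f_warped).sum(axis=0)
        denom = (np.linalg.norm(f_fixed, axis=0)
                 * np.linalg.norm(f_warped, axis=0) + 1e-8)
        loss += weights[lvl] * (1.0 - num / denom).mean()
        if lvl < levels - 1:  # pool to the next coarser scale
            f_fixed, f_warped = avg_pool2d(f_fixed), avg_pool2d(f_warped)
    return loss

# Perfectly aligned features yield (near-)zero loss; misaligned ones do not.
rng = np.random.default_rng(0)
f = rng.random((8, 16, 16))
print(round(hierarchical_feature_consistency(f, f.copy()), 6))  # → 0.0
```

In the actual framework the inputs would be SAM-derived embeddings of the fixed image and of the moving image after warping by the predicted deformation field, and the pooling/weighting scheme would follow the paper's design rather than the simple choices made here.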