SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images

📅 2025-11-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the performance limitations of Segment Anything Model (SAM)-based approaches in few-shot medical image segmentation and the underutilization of hierarchical prior knowledge in medical data, this paper proposes a multi-level self-supervised pre-training framework. Methodologically, it introduces (1) HL-Attn, a hierarchical attention fusion module that jointly models image-, patch-, and pixel-level features; (2) a tri-scale self-supervised learning objective that captures both structured semantics and fine-grained local detail; and (3) an architecture-agnostic design, supporting SAM2, SAMed, and H-SAM, with efficient LoRA-based fine-tuning. Evaluated on the Synapse, LA, and PROMISE12 benchmarks, the method achieves significant improvements over state-of-the-art methods in few-shot settings and reduces fully supervised fine-tuning epochs by 90%. The results demonstrate that explicit hierarchical knowledge representation substantially improves generalizability in medical image segmentation.
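The LoRA-based fine-tuning mentioned above follows the standard low-rank adaptation recipe: a frozen pretrained weight matrix receives a trainable low-rank update. The following is a minimal generic LoRA sketch in NumPy, not the paper's implementation; the dimensions, rank, and scaling factor are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed sizes for illustration; rank r is much smaller than d_in/d_out
d_in, d_out, r, alpha = 64, 64, 4, 8

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # zero-initialized: no change at start

def lora_forward(x):
    # y = W x + (alpha / r) * B A x; only A and B are updated during fine-tuning
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# with B = 0 the adapted layer matches the frozen layer exactly
assert np.allclose(lora_forward(x), W @ x)
```

Because only `A` and `B` (of size `r * (d_in + d_out)`) are trained, the trainable parameter count stays small, which is what makes this style of fine-tuning efficient for large SAM backbones.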

📝 Abstract
The Segment Anything Model (SAM) has demonstrated significant potential in medical image segmentation. Yet, its performance is limited when only a small amount of labeled data is available, while there is abundant valuable yet often overlooked hierarchical information in medical data. To address this limitation, we draw inspiration from self-supervised learning and propose SAMora, an innovative framework that captures hierarchical medical knowledge by applying complementary self-supervised learning objectives at the image, patch, and pixel levels. To fully exploit the complementarity of hierarchical knowledge within LoRAs, we introduce HL-Attn, a hierarchical fusion module that integrates multi-scale features while maintaining their distinct characteristics. SAMora is compatible with various SAM variants, including SAM2, SAMed, and H-SAM. Experimental results on the Synapse, LA, and PROMISE12 datasets demonstrate that SAMora outperforms existing SAM variants. It achieves state-of-the-art performance in both few-shot and fully supervised settings while reducing fine-tuning epochs by 90%. The code is available at https://github.com/ShChen233/SAMora.
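The abstract describes HL-Attn as a fusion module that integrates image-, patch-, and pixel-level features while preserving their distinct characteristics. A minimal sketch of one plausible form of such fusion, attention-style softmax gating over the three levels, is shown below in NumPy; the feature dimension, the query vector, and the gating scheme are assumptions for illustration, not the paper's actual module.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 32  # assumed shared feature dimension after per-level projection

# hypothetical features for the three hierarchy levels
levels = ("image", "patch", "pixel")
feats = {lvl: rng.standard_normal(d) for lvl in levels}

def softmax(z):
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# learnable query scoring each level (stand-in for the module's gating)
q = rng.standard_normal(d)
scores = np.array([q @ feats[lvl] for lvl in levels]) / np.sqrt(d)
weights = softmax(scores)     # one attention weight per hierarchy level

# fused representation: convex combination of the per-level features
fused = sum(w * feats[lvl] for w, lvl in zip(weights, levels))

assert fused.shape == (d,)
assert np.isclose(weights.sum(), 1.0)
```

Because the weights form a convex combination, each level's contribution remains separately inspectable, which is one simple way to fuse multi-scale features without collapsing their distinct characteristics.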
Problem

Research questions and friction points this paper is trying to address.

Enhancing SAM's medical segmentation with limited labeled data
Exploiting hierarchical information in medical images via self-supervised learning
Improving performance in few-shot and fully supervised medical segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical self-supervised pre-training for medical images
Multi-level fusion module integrating multi-scale features
Compatible framework reducing fine-tuning epochs by 90%