SAM2LoRA: Composite Loss-Guided, Parameter-Efficient Finetuning of SAM2 for Retinal Fundus Segmentation

📅 2025-10-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the excessive fine-tuning cost and poor cross-dataset generalization of SAM2 in retinal fundus image segmentation, this paper proposes a parameter-efficient fine-tuning framework. Low-rank adapters (LoRA) are integrated into both the image encoder and mask decoder, and a multi-scale feature fusion decoding mechanism is introduced to enhance the recovery of structural detail. A composite loss function combining binary cross-entropy (BCE), SoftDice, and Focal Tversky losses is designed to improve boundary sensitivity and robustness to class imbalance. With fewer than 5% of the parameters trainable, the method achieves Dice scores of 0.86 (blood vessel) and 0.93 (optic disc) across 11 retinal datasets, with AUCs up to 0.98 and 0.99, respectively, outperforming state-of-the-art approaches. Training overhead is reduced by over 80%, demonstrating substantial gains in efficiency and effectiveness.
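A minimal sketch of how such a composite loss could look, in pure NumPy on predicted probabilities. The loss weights and the Tversky hyperparameters (alpha, beta, gamma) below are illustrative assumptions, not the paper's reported settings:

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # Binary cross-entropy on predicted probabilities.
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def soft_dice_loss(pred, target, eps=1e-7):
    # 1 minus the soft Dice coefficient; rewards region overlap.
    inter = np.sum(pred * target)
    return float(1 - (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps))

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    # Tversky index trades off false negatives (alpha) vs. false positives
    # (beta); the focal exponent gamma emphasizes hard, low-overlap cases.
    tp = np.sum(pred * target)
    fn = np.sum((1 - pred) * target)
    fp = np.sum(pred * (1 - target))
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return float((1 - tversky) ** gamma)

def composite_loss(pred, target, w=(1.0, 1.0, 1.0)):
    # Weighted sum of the three terms; w is an assumed equal weighting.
    return (w[0] * bce_loss(pred, target)
            + w[1] * soft_dice_loss(pred, target)
            + w[2] * focal_tversky_loss(pred, target))
```

The three terms are complementary: BCE supervises every pixel, SoftDice counters class imbalance by scoring overlap, and Focal Tversky sharpens thin-structure boundaries such as vessels.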

📝 Abstract
We propose SAM2LoRA, a parameter-efficient fine-tuning strategy that adapts the Segment Anything Model 2 (SAM2) for fundus image segmentation. SAM2 employs a masked autoencoder-pretrained Hierarchical Vision Transformer for multi-scale feature decoding, enabling rapid inference in low-resource settings; however, fine-tuning remains challenging. To address this, SAM2LoRA integrates a low-rank adapter into both the image encoder and mask decoder, requiring fewer than 5% of the original trainable parameters. Our analysis indicates that for cross-dataset fundus segmentation tasks, a composite loss function combining segmentation BCE, SoftDice, and Focal Tversky losses is essential for optimal network tuning. Evaluated on 11 challenging fundus segmentation datasets, SAM2LoRA demonstrates high performance in both blood vessel and optic disc segmentation under cross-dataset training conditions. It achieves Dice scores of up to 0.86 and 0.93 for blood vessel and optic disc segmentation, respectively, and AUC values of up to 0.98 and 0.99, achieving state-of-the-art performance while substantially reducing training overhead.
Problem

Research questions and friction points this paper is trying to address.

Adapting SAM2 for retinal fundus image segmentation efficiently
Reducing trainable parameters by over 95% with low-rank adapters
Optimizing cross-dataset segmentation using composite loss functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-rank adapters integrated into encoder and decoder
Composite loss combining BCE, SoftDice and FocalTversky
Parameter-efficient fine-tuning requiring under 5% trainable parameters
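The low-rank adapter idea behind these bullets can be sketched as follows. This is a hypothetical NumPy illustration with assumed dimensions and rank; SAM2LoRA applies the same principle inside SAM2's encoder and decoder layers, which are not reproduced here:

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable low-rank update (alpha/rank) * B @ A.

    Hypothetical sketch: only A and B are trained, so for a d x d weight the
    trainable fraction is roughly 2 * rank / d of the original layer.
    """

    def __init__(self, weight, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                             # frozen pretrained weight
        self.A = rng.normal(0, 0.01, (rank, in_dim))     # trainable down-projection
        self.B = np.zeros((out_dim, rank))               # trainable up-projection (zero init)
        self.scale = alpha / rank

    def __call__(self, x):
        # Base path plus scaled low-rank path. Because B starts at zero,
        # the adapted layer reproduces the pretrained output exactly at init.
        return x @ self.weight.T + self.scale * (x @ self.A.T @ self.B.T)

    def trainable_params(self):
        return self.A.size + self.B.size
```

For a 256-wide layer with rank 4, the adapter trains about 3% of the layer's parameters, which is how the sub-5% overall budget becomes feasible when the pretrained backbone stays frozen.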
👥 Authors
Sayan Mandal (AMD, San Jose, CA 95124, USA)
Divyadarshini Karthikeyan (Amazon, Toronto, ON M5J0A8, Canada)
Manas Paldhe (Graduate Student, Purdue University)