🤖 AI Summary
This work addresses the challenge of scarce annotated data in few-shot medical image segmentation by proposing a novel framework leveraging self-supervised DINOv3 features. To mitigate the domain gap between natural-image pretraining and medical imaging, the method introduces two key components: WT-Aug, a wavelet-based feature augmentation module, and CG-Fuse, a context-guided fusion module. By integrating wavelet-domain feature enhancement with cross-attention mechanisms, the approach enables effective multi-scale contextual fusion at the feature level. Extensive experiments on six public datasets spanning five imaging modalities demonstrate that the proposed method significantly outperforms existing few-shot segmentation approaches, underscoring its robustness and generalization capability across diverse medical imaging domains.
📝 Abstract
Deep learning-based automatic medical image segmentation plays a critical role in clinical diagnosis and treatment planning but remains challenging in few-shot scenarios due to the scarcity of annotated training data. Recently, self-supervised foundation models such as DINOv3, trained on large-scale natural image datasets, have shown strong potential for dense feature extraction, which can alleviate the few-shot learning challenge. Yet, their direct application to medical images is hindered by the domain gap. In this work, we propose DINO-AugSeg, a novel framework that leverages DINOv3 features to address the few-shot medical image segmentation challenge. Specifically, we introduce WT-Aug, a wavelet-based feature-level augmentation module that enriches the diversity of DINOv3-extracted features by perturbing their frequency components, and CG-Fuse, a contextual-information-guided fusion module that exploits cross-attention to integrate semantic-rich low-resolution features with spatially detailed high-resolution features. Extensive experiments on six public benchmarks spanning five imaging modalities, including MRI, CT, ultrasound, endoscopy, and dermoscopy, demonstrate that DINO-AugSeg consistently outperforms existing methods under limited-sample conditions. The results highlight the effectiveness of incorporating wavelet-domain augmentation and contextual fusion for robust feature representation, suggesting DINO-AugSeg as a promising direction for advancing few-shot medical image segmentation. Code and data will be made available at https://github.com/apple1986/DINO-AugSeg.
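To make the idea of wavelet-based feature-level augmentation concrete, the sketch below perturbs the high-frequency sub-bands of a feature map with a single-level Haar transform. This is an illustration of the general technique only, not the paper's actual WT-Aug implementation: the wavelet choice, sub-band scaling, and the `wt_aug` interface are all assumptions for demonstration.

```python
import numpy as np

def haar_dwt2(x):
    # Single-level 2D Haar transform of an (H, W) array (H, W even):
    # returns low-low (LL) plus three high-frequency sub-bands (LH, HL, HH).
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    # Exact inverse of haar_dwt2.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def wt_aug(feat, scale_range=(0.8, 1.2), rng=None):
    """Hypothetical frequency-domain augmentation of a (C, H, W) feature map:
    randomly rescale the high-frequency sub-bands, keep the LL band intact."""
    rng = rng or np.random.default_rng()
    out = np.empty_like(feat)
    for c in range(feat.shape[0]):
        ll, lh, hl, hh = haar_dwt2(feat[c])
        s = rng.uniform(*scale_range, size=3)  # one factor per sub-band
        out[c] = haar_idwt2(ll, lh * s[0], hl * s[1], hh * s[2])
    return out
```

With `scale_range=(1.0, 1.0)` the round trip is an identity, which makes the transform easy to sanity-check; widening the range injects controlled high-frequency variation while preserving the coarse (LL) semantics of the features.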