🤖 AI Summary
Ultrasound image segmentation suffers from speckle noise and imaging artifacts, leading to blurred boundaries and structural distortions that severely hinder accuracy and generalization. To address DINOv3’s insensitivity to ultrasound boundary degradation—stemming from its natural-image pretraining—this work proposes a frequency-guided adaptive segmentation framework. First, we introduce a novel multi-scale frequency-domain disentanglement and learnable alignment mechanism (via FFT/DWT) to enhance boundary sensitivity. Second, we design a frequency-driven boundary refinement module that extracts structural priors through boundary prototype clustering. Third, we construct a boundary-semantic collaborative multi-task decoder to enforce structural consistency. Evaluated across multiple ultrasound datasets, our method achieves a 4.2% improvement in boundary Dice score (BDSC) and a 9.8% gain in cross-domain generalization performance, significantly outperforming state-of-the-art approaches. The code is publicly available.
📝 Abstract
Ultrasound image segmentation is pivotal for clinical diagnosis, yet it is challenged by speckle noise and imaging artifacts. Recently, DINOv3 has shown remarkable promise in medical image segmentation with its powerful representation capabilities. However, DINOv3, pre-trained on natural images, lacks sensitivity to ultrasound-specific boundary degradation. To address this limitation, we propose FreqDINO, a frequency-guided segmentation framework that enhances boundary perception and structural consistency. Specifically, we devise a Multi-scale Frequency Extraction and Alignment (MFEA) strategy to separate low-frequency structures and multi-scale high-frequency boundary details, and align them via learnable attention. We also introduce a Frequency-Guided Boundary Refinement (FGBR) module that extracts boundary prototypes from high-frequency components and refines spatial features. Furthermore, we design a Multi-task Boundary-Guided Decoder (MBGD) to ensure spatial coherence between boundary and semantic predictions. Extensive experiments demonstrate that FreqDINO surpasses state-of-the-art methods and achieves remarkable generalization capability. The code is at https://github.com/MingLang-FD/FreqDINO.
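The abstract does not say how MFEA performs its frequency separation, so the following is only a minimal sketch of the general idea, assuming a radial mask in the 2D FFT domain. The function name `split_frequencies` and the cutoff radii are hypothetical choices for illustration and are not taken from the FreqDINO code. It splits an image into one low-frequency component (coarse structure) and several high-frequency bands (finer boundary detail).

```python
import numpy as np


def split_frequencies(image, radii=(0.1, 0.25)):
    """Split a 2D image into a low-frequency part and high-frequency bands.

    `radii` are hypothetical cutoffs in cycles per pixel (0 = DC component).
    Returns (low, highs), where highs[i] covers radii[i] < r <= radii[i + 1]
    and the last band extends to the highest available frequency.
    """
    h, w = image.shape
    # Centre the spectrum so that low frequencies sit in the middle.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Radial distance of every frequency bin from the DC component.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    def band(mask):
        # Keep only the masked frequencies and return to the spatial domain.
        return np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real

    low = band(radius <= radii[0])
    upper_edges = list(radii[1:]) + [np.inf]
    highs = [
        band((radius > lo) & (radius <= hi))
        for lo, hi in zip(radii, upper_edges)
    ]
    return low, highs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))  # stand-in for an ultrasound frame
    low, highs = split_frequencies(image)
    # The masks partition the spectrum, so the components sum to the input.
    print(np.allclose(low + sum(highs), image))
```

Because the masks partition the spectrum, the components add back up to the original image, which makes the decomposition easy to sanity-check. In the framework described above the separated components are aligned through learnable attention; that step is not sketched here, since the abstract gives no details on it.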