🤖 AI Summary
To address two challenges in weakly supervised segmentation of thyroid nodule ultrasound images, namely high pseudo-label noise and rigid loss functions that degrade boundary delineation and localization accuracy, this paper proposes a high-confidence pseudo-label generation framework coupled with high-rationality multi-objective losses. First, spatially aligned bounding-box, foreground, and background pseudo-labels are generated by jointly leveraging four-point annotations and MedSAM-based prompt-guided inference, substantially improving initial label confidence. Second, three complementary losses are introduced: a spatial alignment loss, a foreground-background contrastive loss, and a prototype correlation loss, which jointly enforce topological priors and strengthen multi-scale discriminative feature learning. The method achieves state-of-the-art performance on the TN3K and DDTI benchmarks, with clear gains in nodule localization accuracy and boundary IoU. Code is publicly available.
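The foreground-background contrastive term described above can be illustrated with a minimal sketch. This is not the paper's exact loss: the function name `fg_bg_contrastive_loss`, the mean-embedding pull/push formulation, and the cosine-similarity form are all illustrative assumptions.

```python
import numpy as np

def fg_bg_contrastive_loss(features, fg_mask, bg_mask, eps=1e-8):
    """Illustrative contrastive term (assumed form, not the paper's):
    pull labeled-foreground features toward their mean embedding while
    pushing the foreground and background mean embeddings apart.
    features: (H, W, C) float array; fg_mask/bg_mask: (H, W) boolean."""
    fg = features[fg_mask]                      # (Nf, C) foreground features
    bg = features[bg_mask]                      # (Nb, C) background features
    fg_mean, bg_mean = fg.mean(axis=0), bg.mean(axis=0)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

    pull = 1.0 - np.mean([cos(f, fg_mean) for f in fg])  # fg pixels -> fg mean
    push = max(0.0, cos(fg_mean, bg_mean))               # separate fg/bg means
    return pull + push
```

With well-separated foreground and background features the loss is near zero; when the two regions share the same features, the push term dominates and the loss grows.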
📝 Abstract
Weakly supervised segmentation methods can delineate thyroid nodules in ultrasound images efficiently using training data with coarse labels, but suffer from: 1) low-confidence pseudo-labels that fail to preserve topological priors, introducing significant label noise, and 2) low-rationality loss functions that rigidly compare segmentation with labels, ignoring discriminative information for nodules with diverse and complex shapes. To solve these issues, we clarify the objective and references for weakly supervised ultrasound image segmentation, presenting a framework with high-confidence pseudo-labels to represent topological and anatomical information and high-rationality losses to capture multi-level discriminative features. Specifically, we fuse geometric transformations of four-point annotations with results from the MedSAM model prompted by those annotations to generate high-confidence box, foreground, and background labels. Our high-rationality learning strategy includes: 1) an alignment loss measuring spatial consistency between the segmentation and the box label, and topological continuity within the foreground label, guiding the network to perceive nodule location; 2) a contrastive loss pulling together features from labeled foreground regions while pushing apart features from labeled foreground and background regions, guiding the network to learn the feature distributions of nodules and background; 3) a prototype correlation loss measuring consistency between correlation maps derived by comparing features with the foreground and background prototypes, refining uncertain regions into accurate nodule edges. Experimental results show that our method achieves state-of-the-art performance on the TN3K and DDTI datasets. The code is available at https://github.com/bluehenglee/MLI-MSC.
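One plausible reading of the prototype correlation loss in the abstract can be sketched as follows: compute foreground and background prototypes from the labeled regions, build a cosine-correlation map against each prototype, and penalize inconsistency between the two maps. The function names and the specific complementarity penalty are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def prototypes(features, fg_mask, bg_mask):
    """Class prototypes: mean feature over labeled fg/bg pixels.
    features: (H, W, C) float array; masks: (H, W) boolean."""
    return features[fg_mask].mean(axis=0), features[bg_mask].mean(axis=0)

def correlation_map(features, proto, eps=1e-8):
    """Cosine similarity between every pixel feature and one prototype."""
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    p = proto / (np.linalg.norm(proto) + eps)
    return f @ p                                # (H, W) correlation map

def prototype_correlation_loss(features, fg_mask, bg_mask):
    """Assumed consistency penalty: a pixel strongly correlated with the
    foreground prototype should be weakly correlated with the background
    prototype, and vice versa."""
    fg_p, bg_p = prototypes(features, fg_mask, bg_mask)
    fg_corr = correlation_map(features, fg_p)
    bg_corr = correlation_map(features, bg_p)
    return float(np.mean((fg_corr - (1.0 - bg_corr)) ** 2))
```

In this reading, pixels in uncertain regions get a low penalty only when the two correlation maps agree on which class they belong to, which is one way such a term could sharpen nodule edges.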