🤖 AI Summary
Thyroid nodule segmentation in ultrasound images faces three major challenges: ambiguous boundaries, highly variable nodule sizes, and a severe scarcity of annotated data. Together these lead to weak contextual modeling and poor generalization in existing models. To address them, the authors propose what they describe as the first semi-supervised multi-task Transformer framework designed specifically for this task. The method incorporates anatomical priors of the thyroid gland and jointly optimizes three complementary objectives: nodule segmentation, thyroid gland segmentation, and nodule size estimation. Key technical components include a hierarchical Transformer encoder, semi-supervised pretraining with consistency regularization, local-global feature fusion, and multi-task collaborative optimization, which together improve boundary discrimination and robustness to scale. On the TN3K and DDTI benchmarks, the approach achieves state-of-the-art Dice scores and stronger cross-dataset generalization, suggesting good potential for clinical use.
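For readers unfamiliar with the Dice score mentioned above, the following is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]

    # Pixels marked as foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(dice_score(pred, target))  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly, while 0.0 means they share no foreground pixels.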
📝 Abstract
Accurate thyroid nodule segmentation in ultrasound images is critical for diagnosis and treatment planning. However, ambiguous boundaries between nodules and surrounding tissues, size variations, and the scarcity of annotated ultrasound data pose significant challenges for automated segmentation. Existing deep learning models struggle to incorporate contextual information from the thyroid gland and to generalize effectively across diverse cases. To address these challenges, we propose SSMT-Net, a Semi-Supervised Multi-Task Transformer-based Network. In an initial unsupervised phase, SSMT-Net leverages unlabeled data to strengthen the feature extraction capability of its Transformer-centric encoder. In the supervised phase, the model jointly optimizes nodule segmentation, gland segmentation, and nodule size estimation, integrating both local and global contextual features. Extensive evaluations on the TN3K and DDTI datasets demonstrate that SSMT-Net outperforms state-of-the-art methods in accuracy and robustness, indicating its potential for real-world clinical applications.
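The joint optimization of the three tasks can be pictured as a weighted sum of per-task losses. The sketch below is only an illustration under stated assumptions: the abstract does not specify the loss functions or weights, so binary cross-entropy for the two segmentation tasks, an L1 penalty for size estimation, the weights `w_nodule`, `w_gland`, and `w_size`, and the function names are all hypothetical choices rather than SSMT-Net's actual formulation.

```python
import math


def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def multitask_loss(nodule_probs, nodule_mask,
                   gland_probs, gland_mask,
                   size_pred, size_true,
                   w_nodule=1.0, w_gland=0.5, w_size=0.1):
    """Weighted sum of three task losses (weights are illustrative assumptions)."""
    loss_nodule = bce(nodule_probs, nodule_mask)  # nodule segmentation
    loss_gland = bce(gland_probs, gland_mask)     # gland segmentation
    loss_size = abs(size_pred - size_true)        # nodule size estimation (L1)
    return w_nodule * loss_nodule + w_gland * loss_gland + w_size * loss_size


# Toy example: two pixels per mask, size expressed as a fraction of the image.
loss = multitask_loss(
    nodule_probs=[0.9, 0.2], nodule_mask=[1, 0],
    gland_probs=[0.8, 0.7], gland_mask=[1, 1],
    size_pred=0.30, size_true=0.25,
)
print(loss)
```

Because all three terms are summed into one scalar, a single backward pass through a shared encoder would receive gradients from every task, which is the usual motivation for this kind of multi-task setup.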