TAMISeg: Text-Aligned Multi-scale Medical Image Segmentation with Semantic Encoder Distillation

📅 2026-04-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three challenges in medical image segmentation: scarce fine-grained annotations, anatomical complexity, and image degradation. It introduces TAMISeg, a text-guided framework that combines clinical text prompts with semantic distillation from a frozen DINOv3 vision teacher. TAMISeg comprises a consistency-aware encoder pretrained with strong perturbations, a semantic encoder distillation module, and a scale-adaptive decoder, which together reduce reliance on pixel-level annotations while improving semantic discriminability and robustness. On the Kvasir-SEG, MosMedData+, and QaTa-COV19 datasets, the authors report that TAMISeg consistently outperforms existing unimodal and multimodal methods.

📝 Abstract

Medical image segmentation remains challenging due to limited fine-grained annotations, complex anatomical structures, and image degradation from noise, low contrast, or illumination variation. We propose TAMISeg, a text-guided segmentation framework that incorporates clinical language prompts and semantic distillation as auxiliary semantic cues to enhance visual understanding and reduce reliance on pixel-level fine-grained annotations. TAMISeg integrates three core components: a consistency-aware encoder pretrained with strong perturbations for robust feature extraction, a semantic encoder distillation module with supervision from a frozen DINOv3 teacher to enhance semantic discriminability, and a scale-adaptive decoder that segments anatomical structures across different spatial scales. Experiments on the Kvasir-SEG, MosMedData+, and QaTa-COV19 datasets demonstrate that TAMISeg consistently outperforms existing uni-modal and multi-modal methods in both qualitative and quantitative evaluations. Code will be made publicly available at https://github.com/qczggaoqiang/TAMISeg.
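The abstract does not state which loss the semantic encoder distillation module uses, so the following is only a minimal sketch of one common choice: a mean cosine-distance loss between student features and the features of a frozen teacher. The function names `cosine_similarity` and `distillation_loss`, and the representation of features as plain lists of floats, are assumptions made for illustration and are not taken from the TAMISeg code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector has similarity 0.
        return 0.0
    return dot / (norm_u * norm_v)


def distillation_loss(student_tokens, teacher_tokens):
    """Mean cosine distance (1 - cosine similarity) over paired tokens.

    Hypothetical helper: `teacher_tokens` stands in for the output of a
    frozen teacher, so it is treated as a constant target and only the
    student would be updated to reduce this value.
    """
    if len(student_tokens) != len(teacher_tokens):
        raise ValueError("student and teacher must have the same token count")
    if not student_tokens:
        raise ValueError("at least one token is required")
    total = sum(
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_tokens, teacher_tokens)
    )
    return total / len(student_tokens)


# Usage: two tokens with 2-dimensional features each.
student = [[1.0, 0.0], [0.0, 2.0]]
teacher = [[2.0, 0.0], [0.0, 1.0]]
loss = distillation_loss(student, teacher)
```

Cosine distance ignores feature magnitude, so in this sketch the student is pulled toward the direction of the teacher's features rather than their scale. In a real training loop the same quantity would be computed on tensors with an autodiff framework, and would typically be added to the segmentation loss with a weighting factor.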

Problem

Research questions and friction points this paper is trying to address.

medical image segmentation

limited annotations

anatomical complexity

image degradation

Innovation

Methods, ideas, or system contributions that make the work stand out.

text-guided segmentation

semantic distillation

multi-scale medical image segmentation

DINOv3