Hierarchical Text-Guided Brain Tumor Segmentation via Sub-Region-Aware Prompts

📅 2026-03-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of imprecise segmentation of brain tumor sub-regions—namely, the whole tumor, tumor core, and enhancing tumor—due to their inherently ambiguous boundaries. To overcome the limitations of existing methods, we propose TextCSP, a novel framework that, for the first time, integrates anatomical hierarchy with clinical text semantics during the decoding phase. By leveraging hierarchical text guidance, our approach enables coarse-to-fine segmentation refinement. The method incorporates a sub-region-aware soft prompt tuning mechanism and a text-semantic channel modulator, combined with a LoRA-finetuned BioBERT encoder, a soft-cascaded decoder, and channel-wise feature modulation to achieve fine-grained multimodal fusion. Evaluated on the TextBraTS dataset, TextCSP achieves a 1.7% improvement in Dice score and a 6% reduction in HD95, significantly outperforming current state-of-the-art approaches.

Technology Category

Application Category

📝 Abstract
Brain tumor segmentation remains challenging because the three standard sub-regions, i.e., whole tumor (WT), tumor core (TC), and enhancing tumor (ET), often exhibit ambiguous visual boundaries. Integrating radiological description texts with imaging has shown promise. However, most multimodal approaches typically compress a report into a single global text embedding shared across all sub-regions, overlooking their distinct clinical characteristics. We propose TextCSP (text-modulated soft cascade architecture), a hierarchical text-guided framework that builds on the TextBraTS baseline with three novel components: (1) a text-modulated soft cascade decoder that predicts WT->TC->ET in a coarse-to-fine manner consistent with their anatomical containment hierarchy. (2) sub-region-aware prompt tuning, which uses learnable soft prompts with a LoRA-adapted BioBERT encoder to generate specialized text representations tailored for each sub-region; (3) text-semantic channel modulators that convert the aforementioned representations into channel-wise refinement signals, enabling the decoder to emphasize features aligned with clinically described patterns. Experiments on the TextBraTS dataset demonstrate consistent improvements across all sub-regions against state-of-the-art methods by 1.7% and 6% on the main metrics Dice and HD95.
Problem

Research questions and friction points this paper is trying to address.

brain tumor segmentation
sub-region ambiguity
multimodal integration
text-guided segmentation
clinical text representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical segmentation
text-guided medical imaging
sub-region-aware prompts
soft cascade decoder
LoRA-adapted BioBERT
🔎 Similar Papers
No similar papers found.