Uncertainty-Aware Vision-Language Segmentation for Medical Imaging

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses diagnostic uncertainty in medical image segmentation, which arises from poor image quality and from misalignment between modalities, by proposing a cross-modal segmentation framework that integrates radiological images with clinical text. The core contributions are a Modality Decoding Attention Block (MoDAB) paired with a lightweight State Space Mixer (SSMix) for efficient vision-language fusion and long-range dependency modelling, and a Spectral-Entropic Uncertainty (SEU) loss that jointly captures spatial overlap, spectral consistency, and predictive uncertainty. On the QATA-COVID19, MosMed++, and Kvasir-SEG datasets, the authors report higher segmentation accuracy and lower computational cost than existing state-of-the-art approaches.
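
To make the idea of vision-language fusion concrete, the following is a minimal sketch of generic single-head cross-attention, in which image tokens query clinical-text tokens. It is not the paper's MoDAB or SSMix, whose internals are not described on this page. The function name `cross_attention`, the token counts, the feature width `d`, and the random projection matrices are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Generic cross-attention (hypothetical helper, not the paper's block).

    image_tokens: (N, d) flattened image features, used as queries.
    text_tokens:  (T, d) text features, used as keys and values.
    Returns the fused image tokens (N, d) and the attention map (N, T).
    """
    q = image_tokens @ w_q
    k = text_tokens @ w_k
    v = text_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    attn = softmax(scores, axis=-1)          # each image token attends over the text
    fused = image_tokens + attn @ v          # residual connection keeps visual content
    return fused, attn


# Toy inputs: all sizes and weights below are assumptions for illustration.
rng = np.random.default_rng(0)
d = 16
image_tokens = rng.normal(size=(64, d))  # e.g. an 8x8 feature map, flattened
text_tokens = rng.normal(size=(10, d))   # e.g. 10 embedded report tokens
w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

fused, attn = cross_attention(image_tokens, text_tokens, w_q, w_k, w_v)
```

Each row of `attn` sums to 1, so every image location receives a convex combination of text features. A real model would learn the projections and typically use several heads.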

📝 Abstract
We introduce a novel uncertainty-aware multimodal segmentation framework that leverages both radiological images and associated clinical text for precise medical diagnosis. We propose a Modality Decoding Attention Block (MoDAB) with a lightweight State Space Mixer (SSMix) to enable efficient cross-modal fusion and long-range dependency modelling. To guide learning under ambiguity, we propose the Spectral-Entropic Uncertainty (SEU) Loss, which jointly captures spatial overlap, spectral consistency, and predictive uncertainty in a unified objective. This formulation improves model reliability in complex clinical settings where image quality is poor. Extensive experiments on three publicly available medical datasets (QATA-COVID19, MosMed++, and Kvasir-SEG) demonstrate that our method achieves superior segmentation performance while being significantly more computationally efficient than existing state-of-the-art (SoTA) approaches. Our results highlight the importance of incorporating uncertainty modelling and structured modality alignment in vision-language medical segmentation tasks.

Code: https://github.com/arya-domain/UA-VLS
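
The abstract describes the SEU Loss only at a high level, as one objective combining spatial overlap, spectral consistency, and predictive uncertainty. One plausible way to combine three such terms is sketched below, assuming a soft Dice term, a gap between 2-D FFT magnitudes, and mean binary entropy. These choices, the function name `seu_like_loss`, and the weights `w_spec` and `w_ent` are assumptions for illustration and may differ from the authors' formulation in the linked repository.

```python
import numpy as np


def dice_loss(prob, target, eps=1e-6):
    """Spatial-overlap term: 1 minus the soft Dice coefficient."""
    inter = np.sum(prob * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps))


def spectral_loss(prob, target):
    """Spectral-consistency term: mean absolute gap between FFT magnitudes."""
    f_prob = np.abs(np.fft.fft2(prob, norm="ortho"))
    f_target = np.abs(np.fft.fft2(target, norm="ortho"))
    return float(np.mean(np.abs(f_prob - f_target)))


def entropy_term(prob, eps=1e-6):
    """Predictive-uncertainty term: mean per-pixel binary entropy (nats)."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))))


def seu_like_loss(prob, target, w_spec=0.1, w_ent=0.1):
    """Weighted sum of the three terms; the weights are assumed, not sourced."""
    return (dice_loss(prob, target)
            + w_spec * spectral_loss(prob, target)
            + w_ent * entropy_term(prob))


# Toy 8x8 mask with a square foreground region.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

confident = np.clip(target, 0.01, 0.99)  # near-correct, low-entropy prediction
uncertain = np.full((8, 8), 0.5)         # maximally uncertain prediction

loss_confident = seu_like_loss(confident, target)
loss_uncertain = seu_like_loss(uncertain, target)
```

In this sketch the confident, near-correct prediction scores a lower loss than the uniform 0.5 prediction, because the latter is penalised by both the overlap term and the entropy term.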

Problem

Research questions and friction points this paper is trying to address.

uncertainty-aware
vision-language segmentation
medical imaging
multimodal fusion
predictive uncertainty

Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty-aware
vision-language segmentation
cross-modal fusion
State Space Mixer
Spectral-Entropic Uncertainty