🤖 AI Summary
Medical image segmentation faces challenges including anatomical structure overlap, ambiguous boundaries, and difficulty detecting small lesions. Existing methods suffer from limited generalizability and fine-grained localization accuracy due to the absence of interpretable semantic priors. This paper proposes a clinically inspired two-stage lightweight framework: for the first time, it parses radiology reports into structured semantic priors—such as location, texture, and shape—and embeds them early in the segmentation pipeline. A Transformer-based module fuses these priors to modulate the SAM backbone, synergistically integrating spatial attention, dynamic convolution, and deformable sampling for joint perceptual–cognitive modeling. The method is plug-and-play and compatible with diverse SAM-based systems. Extensive experiments demonstrate significant improvements over state-of-the-art methods across multiple benchmarks, particularly yielding substantial Dice score gains in overlapping and boundary-ambiguous regions.
📝 Abstract
Medical image segmentation is challenging due to overlapping anatomies with ambiguous boundaries and a severe imbalance between foreground and background classes, which particularly affects the delineation of small lesions. Existing methods, including encoder-decoder networks and prompt-driven variants of the Segment Anything Model (SAM), rely heavily on local cues or user prompts and lack integrated semantic priors, thus failing to generalize well to low-contrast or overlapping targets. To address these issues, we propose MedSeg-R, a lightweight, dual-stage framework inspired by clinical reasoning. Its cognitive stage interprets medical reports into structured semantic priors (location, texture, shape), which are fused via a Transformer block. In the perceptual stage, these priors modulate the SAM backbone: spatial attention highlights likely lesion regions, dynamic convolution adapts feature filters to expected textures, and deformable sampling refines spatial support. By embedding this fine-grained guidance early, MedSeg-R disentangles inter-class confusion and amplifies minority-class cues, greatly improving sensitivity to small lesions. On challenging benchmarks, MedSeg-R produces large Dice improvements in overlapping and ambiguous structures, demonstrating plug-and-play compatibility with SAM-based systems.
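The two-stage flow described above (structured priors fused into one guidance signal, which then gates backbone features via spatial attention) can be sketched in a minimal, framework-agnostic form. This is an illustrative toy in numpy, not the paper's actual implementation; all function names, shapes, and the fixed "projection" are assumptions:

```python
import numpy as np

def fuse_priors(location, texture, shape_vec):
    """Toy stand-in for the Transformer fusion block: concatenates the
    structured semantic priors and projects them to one guidance vector."""
    priors = np.concatenate([location, texture, shape_vec])      # (3*d,)
    w = np.ones((priors.size, priors.size // 3)) / priors.size   # fixed "projection"
    return priors @ w                                            # (d,)

def spatial_attention(features, guidance):
    """Modulate backbone features with a prior-derived attention map:
    pixels aligned with the guidance vector are amplified."""
    # features: (H, W, d) feature map from a SAM-like encoder (illustrative)
    scores = features @ guidance                                 # (H, W)
    attn = 1.0 / (1.0 + np.exp(-scores))                         # sigmoid gate in (0, 1)
    return features * attn[..., None]                            # reweighted feature map

# Hypothetical priors parsed from a report ("upper-left, hypoechoic, round")
d = 4
rng = np.random.default_rng(0)
location, texture, shape_vec = (rng.random(d) for _ in range(3))
guidance = fuse_priors(location, texture, shape_vec)

features = rng.random((8, 8, d))                                 # dummy encoder output
modulated = spatial_attention(features, guidance)
assert modulated.shape == features.shape
```

The dynamic-convolution and deformable-sampling branches would plug in at the same point, consuming the same fused guidance vector to condition filter weights and sampling offsets, respectively.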