CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning

📅 2026-03-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing lesion segmentation methods struggle to combine pixel-level accuracy with cognitive reasoning, and they often lack interpretability. To address this limitation, this work proposes CORE-Seg, an end-to-end framework that couples chain-of-thought reasoning with segmentation through a Semantic-Guided Prompt Adapter, so that segmentation and logical inference are optimized jointly. It introduces ComLesion-14K, the first benchmark for complex lesion segmentation with chain-of-thought reasoning, and designs an adaptive dual-granularity reward mechanism to alleviate reward sparsity in reinforcement learning. Trained with a progressive strategy that moves from supervised fine-tuning (SFT) to group relative policy optimization (GRPO), the model achieves a mean Dice score of 37.06%, which is 14.89% higher than the second-best baseline, and reduces the failure rate to 18.42%.
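For readers unfamiliar with the headline metric, the Dice score measures the overlap between a predicted mask P and a ground-truth mask T as 2|P ∩ T| / (|P| + |T|). The snippet below is a minimal sketch of that standard definition on binary masks stored as nested lists; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for binary masks of 0/1 values."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels set in both masks
            pred_sum += p
            target_sum += t
    if pred_sum + target_sum == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
print(round(dice_score(pred_mask, true_mask), 3))  # 0.667
```

A mean Dice of 37.06% would correspond to averaging this quantity over the evaluation cases; how the paper aggregates across cases and how it defines a failure is not stated in this summary.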

📝 Abstract

Medical image segmentation is undergoing a paradigm shift from conventional visual pattern matching to cognitive reasoning analysis. Although Multimodal Large Language Models (MLLMs) have shown promise in integrating linguistic and visual knowledge, significant gaps remain: existing general MLLMs possess broad common sense but lack the specialized visual reasoning required for complex lesions, whereas traditional segmentation models excel at pixel-level segmentation but lack logical interpretability. In this paper, we introduce ComLesion-14K, the first diverse Chain-of-Thought (CoT) benchmark for reasoning-driven complex lesion segmentation. To accomplish this task, we propose CORE-Seg, an end-to-end framework integrating reasoning with segmentation through a Semantic-Guided Prompt Adapter. We design a progressive training strategy from SFT to GRPO, equipped with an adaptive dual-granularity reward mechanism to mitigate reward sparsity. Our method achieves state-of-the-art results with a mean Dice of 37.06% (14.89% higher than the second-best baseline), while reducing the failure rate to 18.42%. Project Page: https://xyxl024.github.io/CORE-Seg.github.io/
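The GRPO stage mentioned in the abstract relies on group-relative advantages: several responses are sampled for the same input, each is scored by a reward function, and each reward is normalized against the mean and standard deviation of its group. The sketch below illustrates only that normalization step. The reward values are placeholder numbers, the function name `group_relative_advantages` is hypothetical, and the use of the population standard deviation is an assumption; the adaptive dual-granularity reward itself is not specified in this abstract and is not reproduced here.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against its sampling group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Placeholder rewards for four responses sampled from the same image and prompt.
group_rewards = [0.1, 0.4, 0.4, 0.7]
advantages = group_relative_advantages(group_rewards)
for reward, advantage in zip(group_rewards, advantages):
    print(f"reward={reward:.2f} advantage={advantage:+.3f}")
```

The normalization also shows why reward sparsity matters: if every response in a group receives the same reward (for example, all zero), all advantages collapse to zero and the group contributes no learning signal, which is the kind of situation a denser reward is meant to avoid.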

Problem

Research questions and friction points this paper is trying to address.

medical image segmentation

complex lesions

visual reasoning

logical interpretability

multimodal large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

reasoning-driven segmentation

Chain-of-Thought

Semantic-Guided Prompt Adapter