CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning

📅 2026-03-06
🤖 AI Summary
Existing lesion segmentation methods struggle to achieve pixel-level accuracy and cognitive reasoning at the same time, and they often lack interpretability. To address this limitation, this work proposes CORE-Seg, an end-to-end framework that integrates a semantic-guided prompt adapter with chain-of-thought reasoning to enable joint optimization of segmentation and logical inference. The authors introduce ComLesion-14K, the first benchmark for complex lesion segmentation with chain-of-thought reasoning, and design an adaptive dual-granularity reward mechanism to alleviate reward sparsity in reinforcement learning. Through a progressive training strategy that moves from supervised fine-tuning (SFT) to group relative policy optimization (GRPO), the model achieves a mean Dice score of 37.06%, which is 14.89% higher than the second-best baseline, and reduces the failure rate to 18.42%.
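For readers unfamiliar with the headline metric, the following is a minimal sketch of the standard Dice coefficient on flattened binary masks. It is not taken from the paper: the function name `dice_score`, the list-based mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as lesion in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total number of lesion pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs: Sequence[tuple]) -> float:
    """Average Dice over (prediction, target) pairs."""
    return sum(dice_score(p, t) for p, t in pairs) / len(pairs)
```

A prediction that covers two pixels, one of which overlaps a one-pixel ground truth, scores 2·1/(2+1) ≈ 0.667 under this definition.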

📝 Abstract
Medical image segmentation is undergoing a paradigm shift from conventional visual pattern matching to cognitive reasoning analysis. Although Multimodal Large Language Models (MLLMs) have shown promise in integrating linguistic and visual knowledge, significant gaps remain: existing general MLLMs possess broad common sense but lack the specialized visual reasoning required for complex lesions, whereas traditional segmentation models excel at pixel-level segmentation but lack logical interpretability. In this paper, we introduce ComLesion-14K, the first diverse Chain-of-Thought (CoT) benchmark for reasoning-driven complex lesion segmentation. To accomplish this task, we propose CORE-Seg, an end-to-end framework integrating reasoning with segmentation through a Semantic-Guided Prompt Adapter. We design a progressive training strategy from SFT to GRPO, equipped with an adaptive dual-granularity reward mechanism to mitigate reward sparsity. Our method achieves state-of-the-art results with a mean Dice of 37.06% (14.89% higher than the second-best baseline), while reducing the failure rate to 18.42%. Project Page: https://xyxl024.github.io/CORE-Seg.github.io/
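To illustrate why reward sparsity matters for GRPO, here is a minimal sketch of the group-relative advantage that GRPO is generally described as using: each sampled response's reward is normalized by the mean and standard deviation of its group. This is not the paper's implementation. The function name `group_relative_advantages`, the use of the population standard deviation, and the `eps` stabilizer are assumptions, and the paper's adaptive dual-granularity reward is not reproduced here.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the other samples in the same group."""
    mu = mean(rewards)
    # Assumption: population standard deviation; implementations differ here.
    sigma = pstdev(rewards)
    # eps avoids division by zero when every reward in the group is identical.
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group where every sampled response earns the same reward (for example,
# all zeros under a sparse reward) yields zero advantage for every sample,
# so the policy receives no learning signal from that group.
sparse_group = group_relative_advantages([0.0, 0.0, 0.0, 0.0])

# A group with graded rewards produces a usable ranking signal.
graded_group = group_relative_advantages([0.0, 0.2, 0.5, 0.9])
```

Under these assumptions, a denser reward that distinguishes partially correct segmentations from complete misses keeps the within-group variance above zero, which is one plausible reading of why the abstract pairs GRPO with a reward designed to mitigate sparsity.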
Problem

Research questions and friction points this paper is trying to address.

medical image segmentation
complex lesions
visual reasoning
logical interpretability
multimodal large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

reasoning-driven segmentation
Chain-of-Thought
Semantic-Guided Prompt Adapter
reinforcement learning
multimodal large language models
Authors

Yuxin Xie
Peking University
audio, mllm

Yuming Chen
Lanzhou University, Nankai University
computer vision, object detection

Yishan Yang
School of Computer Science and Engineering, Southeast University, Nanjing, 211189, China

Yi Zhou
School of Computer Science and Engineering, Southeast University, Nanjing, 211189, China

Tao Zhou
Nanjing University of Science and Technology, IIAI, UNC, SJTU
Computer vision, machine learning, medical image analysis, AI in Healthcare

Zhen Zhao
East China Normal University

Jiacheng Liu
Zhongda Hospital, Southeast University, Nanjing, 210009, China

Huazhu Fu
Principal Scientist, IHPC, A*STAR
Medical Image Analysis, AI for Healthcare, Medical AI, Trustworthy AI