MediRound: Multi-Round Entity-Level Reasoning Segmentation in Medical Images

📅 2025-11-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Most existing medical image segmentation methods are task-specific, and recent text-prompted approaches are confined to single-round dialogues, so they cannot perform multi-round reasoning over entities. To address this, the paper proposes a new task, Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg), and constructs MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues with entity-based reasoning across rounds. It also introduces MediRound, a baseline model for the task, together with a lightweight Judgment & Correction Mechanism that is applied during inference to mitigate the error propagation inherent in a chain-like multi-round pipeline. Evaluated on MR-MedSeg, the approach outperforms conventional medical referring segmentation methods.

📝 Abstract
Despite the progress in medical image segmentation, most existing methods remain task-specific and lack interactivity. Although recent text-prompt-based segmentation approaches enhance user-driven and reasoning-based segmentation, they remain confined to single-round dialogues and fail to perform multi-round reasoning. In this work, we introduce Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg), a new task that requires generating segmentation masks through multi-round queries with entity-level reasoning. To support this task, we construct MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues, featuring entity-based reasoning across rounds. Furthermore, we propose MediRound, an effective baseline model designed for multi-round medical reasoning segmentation. To mitigate the inherent error propagation in the chain-like pipeline of multi-round segmentation, we introduce a lightweight yet effective Judgment & Correction Mechanism during model inference. Experimental results demonstrate that our method effectively addresses the MEMR-Seg task and outperforms conventional medical referring segmentation methods.
Problem

Research questions and friction points this paper is trying to address.

Addresses lack of interactivity in medical image segmentation methods
Enables multi-round entity-level reasoning for segmentation queries
Mitigates error propagation in chain-like multi-round segmentation pipelines
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-round entity-level reasoning segmentation
Large-scale multi-round medical dialogue dataset
Lightweight judgment and correction mechanism
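
The idea of judging and correcting an earlier round's result before a later round builds on it can be illustrated with a toy sketch. This is not the authors' implementation: the `Round` type, the `run_dialogue` function, and the `toy_segment`, `toy_judge`, and `toy_correct` stubs are all hypothetical names introduced here, and masks are modeled as plain sets of pixel coordinates. It only shows, under those assumptions, how a check at inference time can stop an error in one round from carrying into the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

Mask = Set[Tuple[int, int]]  # toy mask: a set of (row, col) pixels


@dataclass
class Round:
    entity: str                       # entity to segment in this round
    depends_on: Optional[str] = None  # entity from an earlier round, if any


def run_dialogue(
    rounds: List[Round],
    segment: Callable[[str, Optional[Mask]], Mask],
    judge: Callable[[str, Mask], bool],
    correct: Callable[[str, Mask], Mask],
) -> Dict[str, Mask]:
    """Run rounds in order, judging each earlier mask before reusing it.

    Assumes every `depends_on` names an entity from an earlier round.
    """
    masks: Dict[str, Mask] = {}
    for r in rounds:
        context = None
        if r.depends_on is not None:
            context = masks[r.depends_on]
            # Judgment: is the earlier mask reliable enough to build on?
            if not judge(r.depends_on, context):
                # Correction: repair it so the error does not propagate.
                context = correct(r.depends_on, context)
                masks[r.depends_on] = context
        masks[r.entity] = segment(r.entity, context)
    return masks


# Hypothetical stubs standing in for real models.
def toy_segment(entity: str, context: Optional[Mask]) -> Mask:
    if entity == "liver":
        return set()  # simulate a failed first round (empty mask)
    # Later entities are searched only inside the context mask.
    return {p for p in (context or set()) if p[0] == 1}


def toy_judge(entity: str, mask: Mask) -> bool:
    return len(mask) > 0  # hypothetical rule: reject empty masks


def toy_correct(entity: str, mask: Mask) -> Mask:
    return {(0, 0), (1, 0), (1, 1)}  # hypothetical re-segmentation result


dialogue = [Round("liver"), Round("tumor", depends_on="liver")]
result = run_dialogue(dialogue, toy_segment, toy_judge, toy_correct)
uncorrected = run_dialogue(dialogue, toy_segment, lambda e, m: True, toy_correct)
```

In this toy setup, `result["tumor"]` is non-empty because the failed liver mask is repaired before the second round uses it, while `uncorrected["tumor"]` is empty because the failed mask is passed along unchecked.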
Authors
Qinyue Tong
Zhejiang University
Ziqian Lu
Zhejiang University;Zhejiang Sci-Tech University
Zero-Shot Learning, Multi-modal, LLM, Contrastive Learning
Jun Liu
Zhejiang University
Rui Zuo
Zhejiang University
Zheming Lu
Zhejiang University