🤖 AI Summary
Medical image segmentation requires joint modeling of diverse targets, including organs, anatomical structures, and tumor microenvironments. Existing methods typically treat these tasks in isolation and neglect the semantic and spatial dependencies between them, which limits performance. To address this, we propose Co-Seg++, a bidirectional collaborative learning framework that unifies semantic and instance segmentation. Our approach introduces a mutual prompting guidance mechanism, a spatio-temporal prompt encoder (STP-Encoder) that captures long-range spatial and temporal relationships between segmentation regions and image embeddings as prior spatial constraints, and a multi-task collaborative decoder (MTC-Decoder) that enables cross-task feature interaction and context-aware consistency optimization. Evaluated on CT dental anatomy, histopathological tissue, and nuclear segmentation benchmarks, our method achieves state-of-the-art performance across semantic, instance, and panoptic segmentation metrics, indicating that coupling the two tasks improves holistic medical image understanding.
📝 Abstract
Medical image analysis is critical yet challenged by the need to jointly segment organs or tissues and numerous instances for anatomical structure and tumor microenvironment analysis. Existing studies typically formulate different segmentation tasks in isolation, overlooking the fundamental interdependencies between these tasks and leading to suboptimal segmentation performance and insufficient medical image understanding. To address this issue, we propose the Co-Seg++ framework for versatile medical segmentation. Specifically, we introduce a novel co-segmentation paradigm that allows semantic and instance segmentation tasks to mutually enhance each other. We first devise a spatio-temporal prompt encoder (STP-Encoder) to capture long-range spatial and temporal relationships between segmentation regions and image embeddings as prior spatial constraints. Moreover, we devise a multi-task collaborative decoder (MTC-Decoder) that leverages cross-guidance to strengthen the contextual consistency of both tasks, jointly computing semantic and instance segmentation masks. Extensive experiments on diverse CT and histopathology datasets demonstrate that the proposed Co-Seg++ outperforms state-of-the-art methods in the semantic, instance, and panoptic segmentation of dental anatomical structures, histopathology tissues, and nuclei instances. The source code is available at https://github.com/xq141839/Co-Seg-Plus.
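For readers unfamiliar with how the three reported output types relate, the following is a minimal sketch of one common convention for combining a semantic mask and an instance mask into a single panoptic map, in which each pixel stores `class_id * label_divisor + instance_id`. It is a toy illustration under stated assumptions, not the Co-Seg++ decoder: the function names `merge_panoptic` and `split_panoptic`, the constant `LABEL_DIVISOR`, and the tiny example masks are all hypothetical and are not taken from the Co-Seg-Plus repository.

```python
from typing import List, Tuple

Mask = List[List[int]]

# Assumption: no image contains more than LABEL_DIVISOR - 1 instances.
LABEL_DIVISOR = 1000


def merge_panoptic(semantic: Mask, instance: Mask,
                   label_divisor: int = LABEL_DIVISOR) -> Mask:
    """Encode every pixel as class_id * label_divisor + instance_id."""
    if len(semantic) != len(instance):
        raise ValueError("masks must have the same number of rows")
    panoptic: Mask = []
    for sem_row, inst_row in zip(semantic, instance):
        if len(sem_row) != len(inst_row):
            raise ValueError("mask rows must have the same length")
        row = []
        for class_id, instance_id in zip(sem_row, inst_row):
            if not 0 <= instance_id < label_divisor:
                raise ValueError("instance id out of range for label_divisor")
            row.append(class_id * label_divisor + instance_id)
        panoptic.append(row)
    return panoptic


def split_panoptic(panoptic: Mask,
                   label_divisor: int = LABEL_DIVISOR) -> Tuple[Mask, Mask]:
    """Recover the semantic and instance masks from a panoptic map."""
    semantic = [[p // label_divisor for p in row] for row in panoptic]
    instance = [[p % label_divisor for p in row] for row in panoptic]
    return semantic, instance


# Hypothetical 2x3 image: class 0 is background, classes 1 and 2 are
# two tissue or structure types; instance id 0 means "no instance".
semantic_mask = [[0, 1, 1],
                 [0, 2, 2]]
instance_mask = [[0, 1, 2],
                 [0, 1, 1]]

panoptic_map = merge_panoptic(semantic_mask, instance_mask)
print(panoptic_map)
```

The encoding is lossless as long as instance ids stay below the divisor, so `split_panoptic(panoptic_map)` returns the two input masks unchanged. How Co-Seg++ itself produces and reconciles its semantic and instance masks is described in the paper and the linked source code, not by this sketch.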