🤖 AI Summary
Weakly supervised semantic segmentation of histopathological tissues—relying solely on image-level labels—suffers from poor pixel-level localization accuracy. To address this challenge, we propose a novel weak-to-strong supervision framework. Our method introduces three key innovations: (1) a mosaic-based image synthesis technique that generates interpretable ground-truth masks using Bézier curves; (2) a synthetic image authenticity filtering module to improve pseudo-label quality; and (3) a CAM-guided self-supervised consistency regularization mechanism to enhance cross-view prediction stability. Evaluated on three public histopathology datasets, our approach achieves state-of-the-art performance in segmentation accuracy and model robustness. It significantly outperforms existing weakly supervised methods without requiring any pixel-level annotations, establishing an efficient and reliable paradigm for annotation-free histopathological image analysis.
📝 Abstract
Tissue semantic segmentation is one of the key tasks in computational pathology. To avoid the expensive and laborious acquisition of pixel-level annotations, a wide range of studies attempt to adopt the class activation map (CAM), a weakly-supervised learning scheme, to achieve pixel-level tissue segmentation. However, CAM-based methods are prone to suffer from under-activation and over-activation issues, leading to poor segmentation performance. To address this problem, we propose a novel weakly-supervised semantic segmentation framework for histopathological images based on image-mixing synthesis and consistency regularization, dubbed HisynSeg. Specifically, synthesized histopathological images with pixel-level masks are generated for fully-supervised model training, where two synthesis strategies are proposed based on Mosaic transformation and B'ezier mask generation. Besides, an image filtering module is developed to guarantee the authenticity of the synthesized images. In order to further avoid the model overfitting to the occasional synthesis artifacts, we additionally propose a novel self-supervised consistency regularization, which enables the real images without segmentation masks to supervise the training of the segmentation model. By integrating the proposed techniques, the HisynSeg framework successfully transforms the weakly-supervised semantic segmentation problem into a fully-supervised one, greatly improving the segmentation accuracy. Experimental results on three datasets prove that the proposed method achieves a state-of-the-art performance. Code is available at https://github.com/Vison307/HisynSeg.