🤖 AI Summary
Computational pathology faces the dual challenges of limited pathological image data and severe class imbalance, both of which hinder model generalizability. To address this, we propose a semantically consistent local patch stitching augmentation method that requires no additional annotation or data acquisition. Our approach segments whole-slide images, performs semantic-aware patch sampling guided by tissue morphology, and stitches patches via spatially aligned composition to generate high-fidelity synthetic samples that preserve both structural integrity and diagnostic relevance. Notably, this is the first work to explicitly enforce local patch semantic consistency as a constraint in histopathological image augmentation, thereby jointly maintaining histological morphology fidelity and clinical interpretability. Evaluated on two colorectal cancer datasets with ResNet and DenseNet classifiers, our method achieves average improvements of +3.2% in classification accuracy and +4.1% in AUC over conventional augmentation techniques.
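The stitching step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "semantic consistency" can be approximated by drawing every patch from images that share one class label, and that "spatially aligned composition" means each patch keeps its original grid position. The function name `stitch_same_class` and the `grid` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def stitch_same_class(images, grid=2, rng=None):
    """Compose one synthetic image from same-class source images.

    Assumption (not from the paper): all images in `images` share a class
    label, so mixing their patches keeps the label unchanged.

    images: array of shape (N, H, W, C).
    grid:   number of patches per side; H and W must be divisible by it.
    """
    rng = np.random.default_rng() if rng is None else rng
    images = np.asarray(images)
    n, h, w, _ = images.shape
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    ph, pw = h // grid, w // grid

    out = np.empty_like(images[0])
    for r in range(grid):
        for c in range(grid):
            # Pick a donor image at random for this grid cell.
            src = rng.integers(n)
            ys = slice(r * ph, (r + 1) * ph)
            xs = slice(c * pw, (c + 1) * pw)
            # Copy the patch from the same location in the donor image.
            out[ys, xs] = images[src, ys, xs]
    return out
```

Under these assumptions, calling `stitch_same_class(class_images, grid=2)` repeatedly on the minority class would yield additional labeled samples without new annotation. A tissue-morphology-guided sampler, as the summary describes, would replace the uniform random donor choice.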
📝 Abstract
Computational pathology, which integrates computational methods and digital imaging, has been shown to be effective in advancing disease diagnosis and prognosis. In recent years, the development of machine learning and deep learning has greatly bolstered the power of computational pathology. However, data scarcity and data imbalance remain open issues, and both can have an adverse effect on any computational method. In this paper, we introduce an efficient and effective data augmentation strategy that generates new pathology images from existing ones and thus enriches datasets without additional data collection or annotation costs. To evaluate the proposed method, we employed two colorectal cancer datasets and obtained improved classification results, suggesting that this simple approach has the potential to alleviate data scarcity and imbalance in computational pathology.